FastDFS—基础

一、分布式文件系统

  1、什么是分布式文件系统

    随着文件数据的越来越多,通过tomcat或nginx虚拟化的静态资源文件在单一的一个服务器节点内是存不下的,如果用多个节点来存储也可以,但是不利于管理和维护,所以我们需要一个系统来管理多台计算机节点上的文件数据,这就是分布式文件系统。

    分布式文件系统是一个允许文件通过网络在多台节点上分享的文件系统,多台计算机节点共同组成一个整体,为更多的用户提供分享文件和存储空间。比如常见的网盘,本质就是一个分布式的文件存储系统。虽然我们是一个分布式的文件系统,但是对用户来说是透明的,用户使用的时候,就像是访问本地磁盘一样。

    分布式文件系统可以提供冗余备份,所以容错能力很高。系统中有某些节点宕机,但是整体文件服务不会停止,还是能够为用户提供服务,整体还是运作的,数据也不会丢失。

    分布式文件系统的可扩展性强,增加或减少节点都很简单,不会影响线上服务,增加完毕后会发布到线上,加入到集群中为用户提供服务。

    分布式文件系统可以提供负载均衡能力,在读取文件副本的时候可以由多个节点共同提供服务,而且可以通过横向扩展来确保性能的提升与负载。

二、fastDFS介绍

  1、简介

  FastDFS 是一个开源的高性能分布式文件系统(DFS)。 它的主要功能包括:文件存储,文件同步和文件访问,以及高容量和负载平衡。主要解决了海量数据存储问题,特别适合以中小文件(建议范围:4KB < file_size <500MB)为载体的在线服务。

FastDFS 系统有三个角色:跟踪服务器(Tracker Server)、存储服务器(Storage Server)和客户端(Client)。

  Tracker Server:跟踪服务器,主要做调度工作,起到均衡的作用;负责管理所有的 storage server和 group,每个 storage 在启动后会连接 Tracker,告知自己所属 group 等信息,并保持周期性心跳。

  Storage Server:存储服务器,主要提供容量和备份服务;以 group 为单位,每个 group 内可以有多台 storage server,数据互为备份。

  Client:客户端,上传下载数据的服务器,也就是我们自己的项目所部署在的服务器。

  

  2、fastDFS 存储策略

  为了支持大容量,存储节点(服务器)采用了分卷(或分组)的组织方式。存储系统由一个或多个卷组成,卷与卷之间的文件是相互独立的,所有卷的文件容量累加就是整个存储系统中的文件容量。一个卷可以由一台或多台存储服务器组成,一个卷下的存储服务器中的文件都是相同的,卷中的多台存储服务器起到了冗余备份和负载均衡的作用。
  在卷中增加服务器时,同步已有的文件由系统自动完成,同步完成后,系统自动将新增服务器切换到线上提供服务。当存储空间不足或即将耗尽时,可以动态添加卷。只需要增加一台或多台服务器,并将它们配置为一个新的卷,这样就扩大了存储系统的容量。

  3、fastDFS 上传过程

  FastDFS向使用者提供基本文件访问接口,比如upload、download、append、delete等,以客户端库的方式提供给用户使用。
  Storage Server会定期的向Tracker Server发送自己的存储信息。当Tracker Server Cluster中的Tracker Server不止一个时,各个Tracker之间的关系是对等的,所以客户端上传时可以选择任意一个Tracker。
当Tracker收到客户端上传文件的请求时,会为该文件分配一个可以存储文件的group,当选定了group后就要决定给客户端分配group中的哪一个storage server。当分配好storage server后,客户端向storage发送写文件请求,storage将会为文件分配一个数据存储目录。然后为文件分配一个fileid,最后根据以上的信息生成文件名存储文件。

  

   4、fastDFS 文件下载

  客户端uploadfile成功后,会拿到一个storage生成的文件名,接下来客户端根据这个文件名即可访问到该文件。

  

   跟upload file一样,在downloadfile时客户端可以选择任意tracker server。tracker发送download请求给某个tracker,必须带上文件名信息,tracke从文件名中解析出文件的group、大小、创建时间等信息,然后为该请求选择一个storage用来服务读请求。

三、安装 fastDFS 环境

  1、下载安装 libfastcommon,并解压

wget https://github.com/happyfish100/libfastcommon/archive/V1.0.7.tar.gz

  

     进入到 libfastcommon-1.0.7 中

  

     编译、安装

  

  

     创建软连接

ln -s /usr/lib64/libfastcommon.so /usr/local/lib/libfastcommon.so
ln -s /usr/lib64/libfastcommon.so /usr/lib/libfastcommon.so
ln -s /usr/lib64/libfdfsclient.so /usr/local/lib/libfdfsclient.so
ln -s /usr/lib64/libfdfsclient.so /usr/lib/libfdfsclient.so

  

  2、安装 fastDFS

wget https://github.com/happyfish100/fastdfs/archive/V5.05.tar.gz

     解压

 tar -zxvf V5.05.tar.gz 
 cd cd fastdfs-5.05/

     编译安装

./make.sh
./make.sh install 

     进入 /usr/bin,全是脚本文件 

  

     进入 /etc/fdfs/,配置文件

  

     进入 /etc/init.d,服务脚本文件

  

     fastDFS 服务脚本设置的 bin 目录是 /usr/local/bin,但实际命令安装在 /usr/bin/ 下。

    建立 /usr/bin 到 /usr/local/bin 的软连接

 ln -s /usr/bin/fdfs_trackerd   /usr/local/bin
 ln -s /usr/bin/fdfs_storaged   /usr/local/bin
 ln -s /usr/bin/stop.sh         /usr/local/bin
 ln -s /usr/bin/restart.sh      /usr/local/bin

  

四、配置 fastDFS 跟踪器(Tracker)

  1、进入 /etc/fdfs 

  

   2、复制 tracker.conf.sample,并重命名 tracker.conf

  

   3、编辑 tracker.conf,其它的默认即可

# is this config file disabled
# false for enabled
# true for disabled
disabled=false

# bind an address of this host
# empty for bind all addresses of this host
bind_addr=

# the tracker server port
port=22122

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# the base path to store data and log files
# Tracker 数据和日志目录地址(根目录必须存在,子目录会自动创建)
base_path=/usr/local/fastdfs/tracker

# max concurrent connections this server supported
max_connections=256

# accept thread count
# default value is 1
# since V4.07
accept_threads=1

# work thread count, should <= max_connections
# default value is 4
# since V2.00
work_threads=4

# the method of selecting group to upload files
# 0: round robin
# 1: specify group
# 2: load balance, select the max free space group to upload file
store_lookup=2

# which group to upload file
# when store_lookup set to 1, must set store_group to the group name
store_group=group2

# which storage server to upload file
# 0: round robin (default)
# 1: the first server order by ip address
# 2: the first server order by priority (the minimal)
store_server=0

# which path(means disk or mount point) of the storage server to upload file
# 0: round robin
# 2: load balance, select the max free space path to upload file
store_path=0

# which storage server to download file
# 0: round robin (default)
# 1: the source storage server which the current file uploaded to
download_server=0

# reserved storage space for system or other applications.
# if the free(available) space of any stoarge server in 
# a group <= reserved_storage_space, 
# no file can be uploaded to this group.
# bytes unit can be one of follows:
### G or g for gigabyte(GB)
### M or m for megabyte(MB)
### K or k for kilobyte(KB)
### no unit for byte(B)
### XX.XX% as ratio such as reserved_storage_space = 10%
reserved_storage_space = 10%

#standard log level as syslog, case insensitive, value list:
### emerg for emergency
### alert
### crit for critical
### error
### warn for warning
### notice
### info
### debug
log_level=info

#unix group name to run this program, 
#not set (empty) means run by the group of current user
run_by_group=

#unix username to run this program,
#not set (empty) means run by current user
run_by_user=

# allow_hosts can ocur more than once, host can be hostname or ip address,
# "*" means match all ip addresses, can use range like this: 10.0.1.[1-15,20] or
# host[01-08,20-25].domain.com, for example:
# allow_hosts=10.0.1.[1-15,20]
# allow_hosts=host[01-08,20-25].domain.com
allow_hosts=*

# sync log buff to disk every interval seconds
# default value is 10 seconds
sync_log_buff_interval = 10

# check storage server alive interval seconds
check_active_interval = 120

# thread stack size, should >= 64KB
# default value is 64KB
thread_stack_size = 64KB

# auto adjust when the ip address of the storage server changed
# default value is true
storage_ip_changed_auto_adjust = true

# storage sync file max delay seconds
# default value is 86400 seconds (one day)
# since V2.00
storage_sync_file_max_delay = 86400

# the max time of storage sync a file
# default value is 300 seconds
# since V2.00
storage_sync_file_max_time = 300

# if use a trunk file to store several small files
# default value is false
# since V3.00
use_trunk_file = false 

# the min slot size, should <= 4KB
# default value is 256 bytes
# since V3.00
slot_min_size = 256

# the max slot size, should > slot_min_size
# store the upload file to trunk file when it's size <=  this value
# default value is 16MB
# since V3.00
slot_max_size = 16MB

# the trunk file size, should >= 4MB
# default value is 64MB
# since V3.00
trunk_file_size = 64MB

# if create trunk file advancely
# default value is false
# since V3.06
trunk_create_file_advance = false

# the time base to create trunk file
# the time format: HH:MM
# default value is 02:00
# since V3.06
trunk_create_file_time_base = 02:00

# the interval of create trunk file, unit: second
# default value is 38400 (one day)
# since V3.06
trunk_create_file_interval = 86400

# the threshold to create trunk file
# when the free trunk file size less than the threshold, will create 
# the trunk files
# default value is 0
# since V3.06
trunk_create_file_space_threshold = 20G

# if check trunk space occupying when loading trunk free spaces
# the occupied spaces will be ignored
# default value is false
# since V3.09
# NOTICE: set this parameter to true will slow the loading of trunk spaces 
# when startup. you should set this parameter to true when neccessary.
trunk_init_check_occupying = false

# if ignore storage_trunk.dat, reload from trunk binlog
# default value is false
# since V3.10
# set to true once for version upgrade when your version less than V3.10
trunk_init_reload_from_binlog = false

# the min interval for compressing the trunk binlog file
# unit: second
# default value is 0, 0 means never compress
# FastDFS compress the trunk binlog when trunk init and trunk destroy
# recommand to set this parameter to 86400 (one day)
# since V5.01
trunk_compress_binlog_min_interval = 0

# if use storage ID instead of IP address
# default value is false
# since V4.00
use_storage_id = false

# specify storage ids filename, can use relative or absolute path
# since V4.00
storage_ids_filename = storage_ids.conf

# id type of the storage server in the filename, values are:
## ip: the ip address of the storage server
## id: the server id of the storage server
# this paramter is valid only when use_storage_id set to true
# default value is ip
# since V4.03
id_type_in_filename = ip

# if store slave file use symbol link
# default value is false
# since V4.01
store_slave_file_use_link = false

# if rotate the error log every day
# default value is false
# since V4.02
rotate_error_log = false

# rotate error log time base, time format: Hour:Minute
# Hour from 0 to 23, Minute from 0 to 59
# default value is 00:00
# since V4.02
error_log_rotate_time=00:00

# rotate error log when the log file exceeds this size
# 0 means never rotates log file by log file size
# default value is 0
# since V4.02
rotate_error_log_size = 0

# keep days of the log files
# 0 means do not delete old log files
# default value is 0
log_file_keep_days = 0

# if use connection pool
# default value is false
# since V4.05
use_connection_pool = false

# connections whose the idle time exceeds this time will be closed
# unit: second
# default value is 3600
# since V4.05
connection_pool_max_idle_time = 3600

# HTTP port on this tracker server
http.server_port=8080

# check storage HTTP server alive interval seconds
# <= 0 for never check
# default value is 30
http.check_alive_interval=30

# check storage HTTP server alive type, values are:
#   tcp : connect to the storge server with HTTP port only, 
#        do not request and get response
#   http: storage check alive url must return http status 200
# default value is tcp
http.check_alive_type=tcp

# check storage HTTP server alive uri/url
# NOTE: storage embed HTTP server support uri: /status.html
http.check_alive_uri=/status.html

 创建目录(-p递归创建)

mkdir /usr/local/fastdfs/tracker -p

  4、启动 Tracker

# 第一种
    /etc/init.d/fdfs_trackerd start
# 也可以用这种方式启动,前提是上面创建了软链接,后面都用这种方式
    service fdfs_trackerd start

  

     Tracker服务启动成功后,会在base_path下创建data、logs两个目录。目录结构如下:

${base_path}
  |__data
  |   |__storage_groups.dat:存储分组信息
  |   |__storage_servers.dat:存储服务器列表
  |__logs
  |   |__trackerd.log: tracker server 日志文件 

  5、关闭 Tracker 命令

 service fdfs_trackerd stop

  6、设置 Tracker 开机启动

chkconfig fdfs_trackerd on 

    或者,前提需要添加权限

chmod +x /etc/rc.d/rc.local
vim /etc/rc.d/rc.local

# 加入配置:
/etc/init.d/fdfs_trackerd start

五、配置 FastDFS 存储 (Storage)

  1、复制 Storage.conf

    

    

# is this config file disabled
# false for enabled
# true for disabled
disabled=false

# the name of the group this storage server belongs to
#
# comment or remove this item for fetching from tracker server,
# in this case, use_storage_id must set to true in tracker.conf,
# and storage_ids.conf must be configed correctly.
# 指定此 storage server 所在 组(卷)
group_name=group1

# bind an address of this host
# empty for bind all addresses of this host
bind_addr=

# if bind an address of this host when connect to other servers 
# (this storage server as a client)
# true for binding the address configed by above parameter: "bind_addr"
# false for binding any address of this host
client_bind=true

# the storage server port
port=23000

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# heart beat interval in seconds
# 心跳间隔时间,单位为秒 (这里是指主动向 tracker server 发送心跳) heart_beat_interval
=30 # disk usage report interval in seconds stat_report_interval=60 # the base path to store data and log files base_path=/usr/local/fastdfs/storage # max concurrent connections the server supported # default value is 256 # more max_connections means more memory will be used max_connections=256 # the buff size to recv / send data # this parameter must more than 8KB # default value is 64KB # since V2.00 buff_size = 256KB # accept thread count # default value is 1 # since V4.07 accept_threads=1 # work thread count, should <= max_connections # work thread deal network io # default value is 4 # since V2.00 work_threads=4 # if disk read / write separated ## false for mixed read and write ## true for separated read and write # default value is true # since V2.00 disk_rw_separated = true # disk reader thread count per store base path # for mixed read / write, this parameter can be 0 # default value is 1 # since V2.00 disk_reader_threads = 1 # disk writer thread count per store base path # for mixed read / write, this parameter can be 0 # default value is 1 # since V2.00 disk_writer_threads = 1 # when no entry to sync, try read binlog again after X milliseconds # must > 0, default value is 200ms sync_wait_msec=50 # after sync a file, usleep milliseconds # 0 for sync successively (never call usleep) sync_interval=0
# 允许系统同步的时间段 (默认是全天) 。一般用于避免高峰同步产生一些问题而设定。 # storage sync start time of a day, time format: Hour:Minute # Hour from
0 to 23, Minute from 0 to 59 sync_start_time=00:00 # storage sync end time of a day, time format: Hour:Minute # Hour from 0 to 23, Minute from 0 to 59 sync_end_time=23:59 # write to the mark file after sync N files # default value is 500 write_mark_file_freq=500 # path(disk or mount point) count, default value is 1
# 存放文件时 storage server 支持多个路径。这里配置存放文件的基路径数目,通常只配一个目录。 store_path_count=1 # store_path#, based 0, if store_path0 not exists, it's value is base_path # the paths must be exist store_path0=/usr/local/fastdfs/storage #store_path1=/home/yuqing/fastdfs2 # subdir_count * subdir_count directories will be auto created under each # store_path (disk), value can be 1 to 256, default value is 256
# FastDFS 存储文件时,采用了两级目录。这里配置存放文件的目录个数。
# 如果本参数只为 N(如: 256),那么 storage server 在初次运行时,会在 store_path 下自动创建 N * N 个存放文件的子目录。 subdir_count_per_path=256 # tracker_server can ocur more than once, and tracker_server format is # "host:port", host can be hostname or ip address # 用于和tarcker通信,心跳
# tracker_server 的列表 ,会主动连接 tracker_server
# 有多个 tracker server 时,每个 tracker server 写一行 tracker_server
=192.168.0.44:22122 #standard log level as syslog, case insensitive, value list: ### emerg for emergency ### alert ### crit for critical ### error ### warn for warning ### notice ### info ### debug log_level=info #unix group name to run this program, #not set (empty) means run by the group of current user run_by_group= #unix username to run this program, #not set (empty) means run by current user run_by_user= # allow_hosts can ocur more than once, host can be hostname or ip address, # "*" means match all ip addresses, can use range like this: 10.0.1.[1-15,20] or # host[01-08,20-25].domain.com, for example: # allow_hosts=10.0.1.[1-15,20] # allow_hosts=host[01-08,20-25].domain.com allow_hosts=* # the mode of the files distributed to the data path # 0: round robin(default) # 1: random, distributted by hash code file_distribute_path_mode=0 # valid when file_distribute_to_path is set to 0 (round robin), # when the written file count reaches this number, then rotate to next path # default value is 100 file_distribute_rotate_count=100 # call fsync to disk when write big file # 0: never call fsync # other: call fsync when written bytes >= this bytes # default value is 0 (never call fsync) fsync_after_written_bytes=0 # sync log buff to disk every interval seconds # must > 0, default value is 10 seconds sync_log_buff_interval=10 # sync binlog buff / cache to disk every interval seconds # default value is 60 seconds sync_binlog_buff_interval=10 # sync storage stat info to disk every interval seconds # default value is 300 seconds sync_stat_file_interval=300 # thread stack size, should >= 512KB # default value is 512KB thread_stack_size=512KB # the priority as a source server for uploading file. # the lower this value, the higher its uploading priority. # default value is 10 upload_priority=10 # the NIC alias prefix, such as eth in Linux, you can see it by ifconfig -a # multi aliases split by comma. empty value means auto set by OS type # default values is empty if_alias_prefix= # if check file duplicate, when set to true, use FastDHT to store file indexes # 1 or yes: need check # 0 or no: do not check # default value is 0 check_file_duplicate=0 # file signature method for check file duplicate ## hash: four 32 bits hash code ## md5: MD5 signature # default value is hash # since V4.01 file_signature_method=hash # namespace for storing file indexes (key-value pairs) # this item must be set when check_file_duplicate is true / on key_namespace=FastDFS # set keep_alive to 1 to enable persistent connection with FastDHT servers # default value is 0 (short connection) keep_alive=0 # you can use "#include filename" (not include double quotes) directive to # load FastDHT server list, when the filename is a relative path such as # pure filename, the base path is the base path of current/this config file. # must set FastDHT server list when check_file_duplicate is true / on # please see INSTALL of FastDHT for detail ##include /home/yuqing/fastdht/conf/fdht_servers.conf # if log to access log # default value is false # since V4.00 use_access_log = false # if rotate the access log every day # default value is false # since V4.00 rotate_access_log = false # rotate access log time base, time format: Hour:Minute # Hour from 0 to 23, Minute from 0 to 59 # default value is 00:00 # since V4.00 access_log_rotate_time=00:00 # if rotate the error log every day # default value is false # since V4.02 rotate_error_log = false # rotate error log time base, time format: Hour:Minute # Hour from 0 to 23, Minute from 0 to 59 # default value is 00:00 # since V4.02 error_log_rotate_time=00:00 # rotate access log when the log file exceeds this size # 0 means never rotates log file by log file size # default value is 0 # since V4.02 rotate_access_log_size = 0 # rotate error log when the log file exceeds this size # 0 means never rotates log file by log file size # default value is 0 # since V4.02 rotate_error_log_size = 0 # keep days of the log files # 0 means do not delete old log files # default value is 0 log_file_keep_days = 0 # if skip the invalid record when sync file # default value is false # since V4.02 file_sync_skip_invalid_record=false # if use connection pool # default value is false # since V4.05 use_connection_pool = false # connections whose the idle time exceeds this time will be closed # unit: second # default value is 3600 # since V4.05 connection_pool_max_idle_time = 3600 # use the ip address of this storage server if domain_name is empty, # else this domain name will ocur in the url redirected by the tracker server http.domain_name= # the port of the web server on this storage server http.server_port=8888

   2、启动 Storage

    启动Storage前确保Tracker是启动的。

  

     初次启动 Storage 会在刚才设置的目录下创建 data 、logs 两个目录。

# 可以用这种方式启动
/etc/init.d/fdfs_storaged start

# 第二种
service fdfs_storaged start

  

   3、关闭 Storage 命令

service fdfs_storaged stop

  4、查看 Storage 和 Tracker 是否在通信

/usr/bin/fdfs_monitor /etc/fdfs/storage.conf

  

   5、设置 Storage 开机启动

chkconfig fdfs_storage on

    或者修改配置文件

vim /etc/rc.d/rc.local

# 加入配置
/etc/init.d/fdfs_storaged start

   6、Storage 目录

  

   7、文件上传测试

  修改 Tracker 服务器的客户端配置文件,并且创建

  

# connect timeout in seconds
# default value is 30s
connect_timeout=30

# network timeout in seconds
# default value is 30s
network_timeout=60

# the base path to store log files
base_path=/usr/local/fastdfs/client

# tracker_server can ocur more than once, and tracker_server format is
#  "host:port", host can be hostname or ip address
tracker_server=192.168.0.44:22122

#standard log level as syslog, case insensitive, value list:
### emerg for emergency
### alert
### crit for critical
### error
### warn for warning
### notice
### info
### debug
log_level=info

# if use connection pool
# default value is false
# since V4.05
use_connection_pool = false

# connections whose the idle time exceeds this time will be closed
# unit: second
# default value is 3600
# since V4.05
connection_pool_max_idle_time = 3600

# if load FastDFS parameters from tracker server
# since V4.05
# default value is false
load_fdfs_parameters_from_tracker=false

# if use storage ID instead of IP address
# same as tracker.conf
# valid only when load_fdfs_parameters_from_tracker is false
# default value is false
# since V4.05
use_storage_id = false

# specify storage ids filename, can use relative or absolute path
# same as tracker.conf
# valid only when load_fdfs_parameters_from_tracker is false
# since V4.05
storage_ids_filename = storage_ids.conf


#HTTP settings
http.tracker_server_port=80

#use "#include" directive to include HTTP other settiongs
##include http.conf

  创建文件夹

mkdir /usr/local/fastdfs/client

  随便照一张图片下载下来,并重命名

  

# 使用命令,前面脚本,客户端,要上传的文件
/usr/bin/fdfs_upload_file /etc/fdfs/client.conf test.jpg

   

   上传成功后,会返回图片路径

  

 

posted @ 2021-09-07 23:33  放手解脱  阅读(227)  评论(0编辑  收藏  举报