Data Accelerator Goose File System (GooseFS) offers the following advantages in data lake scenarios:
Data I/O Performance
GooseFS deploys a distributed shared cache close to the compute nodes. Upper-layer computing applications can transparently cache frequently accessed hot data from remote storage near the compute nodes, which accelerates data I/O. GooseFS also provides metadata caching, which speeds up metadata operations such as querying file information and listing files in big data scenarios. When used together with big data buckets, it can further accelerate file renaming. In addition, you can choose among storage media such as MEM, SSD, NVMe, and HDD to balance cost against data access performance.
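The effect of caching hot data near the compute node can be illustrated with a minimal read-through cache sketch. This is not GooseFS code or its API: the `ReadThroughCache` class, the `fetch_remote` callable, and the file paths are hypothetical, and LRU eviction is assumed here purely for illustration.

```python
from collections import OrderedDict


class ReadThroughCache:
    """Toy LRU read-through cache standing in for a cache tier near the compute node."""

    def __init__(self, fetch_remote, capacity):
        self.fetch_remote = fetch_remote  # hypothetical callable: path -> bytes
        self.capacity = capacity          # maximum number of cached objects
        self.entries = OrderedDict()      # path -> data, oldest entry first
        self.hits = 0
        self.misses = 0

    def read(self, path):
        if path in self.entries:
            # Cache hit: serve locally and mark the entry as recently used.
            self.entries.move_to_end(path)
            self.hits += 1
            return self.entries[path]
        # Cache miss: fetch from remote storage, then keep a local copy.
        self.misses += 1
        data = self.fetch_remote(path)
        self.entries[path] = data
        if len(self.entries) > self.capacity:
            # Evict the least recently used entry.
            self.entries.popitem(last=False)
        return data


# Hypothetical remote store; each remote read is recorded so it can be counted.
remote_store = {
    "/warehouse/a.parquet": b"A",
    "/warehouse/b.parquet": b"B",
    "/warehouse/c.parquet": b"C",
}
remote_reads = []


def fetch_remote(path):
    remote_reads.append(path)
    return remote_store[path]


cache = ReadThroughCache(fetch_remote, capacity=2)
for p in ["/warehouse/a.parquet", "/warehouse/a.parquet",
          "/warehouse/b.parquet", "/warehouse/a.parquet"]:
    cache.read(p)
```

In this run, four reads result in only two remote fetches, because the repeated reads of the hot file are served from the local cache.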
Integrated Storage
GooseFS provides a unified namespace that supports the storage semantics of COS as well as HDFS, Kubernetes CSI, and FUSE. It gives upper-layer businesses an integrated storage solution and simplifies Ops configuration on the business side. Integrated storage removes the barriers between different underlying storage systems, which makes it easier for upper-layer applications to manage and move data and improves data utilization.
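The idea of a unified namespace can be sketched as a mount table that maps a single logical path tree onto several underlying storage systems. The following is a conceptual sketch only, not the GooseFS implementation: the `UnifiedNamespace` class, the mount points, the bucket name, and the NameNode address are all hypothetical.

```python
class UnifiedNamespace:
    """Toy mount table: maps logical paths to URIs in underlying storage systems."""

    def __init__(self):
        self.mounts = {}  # mount point -> backend URI prefix

    def mount(self, mount_point, backend_uri):
        # Normalize trailing slashes so prefixes join cleanly.
        self.mounts[mount_point.rstrip("/")] = backend_uri.rstrip("/")

    def resolve(self, path):
        # Pick the longest mount point that is a path-component prefix of `path`.
        best = None
        for mount_point in self.mounts:
            if path == mount_point or path.startswith(mount_point + "/"):
                if best is None or len(mount_point) > len(best):
                    best = mount_point
        if best is None:
            raise KeyError(f"no mount point covers {path}")
        return self.mounts[best] + path[len(best):]


ns = UnifiedNamespace()
# Hypothetical backends: a COS bucket and an HDFS cluster.
ns.mount("/cos", "cosn://examplebucket-1250000000/data")
ns.mount("/hdfs", "hdfs://namenode:8020/warehouse")

cos_uri = ns.resolve("/cos/logs/2024/app.log")
hdfs_uri = ns.resolve("/hdfs/tables/orders")
```

An application works only with logical paths such as `/cos/...` and `/hdfs/...`; which storage system actually holds the data is a matter of mount configuration rather than application code.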
Ecosystem Affinity
GooseFS is fully compatible with the Tencent Cloud big data platform frameworks and also supports customized local deployment on the customer side, giving it excellent ecosystem affinity. On the business side, GooseFS can accelerate big data services in the Elastic MapReduce product on Tencent Cloud, and it can also be conveniently deployed on public cloud CVM instances or in self-built IDCs. In addition, GooseFS supports transparent acceleration. Users who already use Tencent Cloud COSN and CHDFS need only simple configuration changes for GooseFS to automatically accelerate business access to COSN and CHDFS, without modifying any business code or access paths.