摘要:Share interest, spread happiness, increase knowledge, leave a beautiful!
分享兴趣,传播快乐,增长见闻,留下美好!
亲爱的您,这里是LearningYard新学苑。
今天小编为大家带来文章
“小h漫谈(11):大数据处理架构”
欢迎您的访问。
Share interest, spread happiness, increase knowledge, leave a beautiful!
Dear, this is LearningYard Academy.
Today Xiaobian brings you an article
"Chat with Little H (11): Big Data Processing Architecture"
Welcome to your visit.
一、思维导图(Mind mapping)
二、精读内容(Intensive Reading Content)
Hadoop 是一个开源的、可运行于大规模集群上的分布式计算平台,它实现了 MapReduce 计算模型和分布式文件系统 HDFS 等功能,在业内得到了广泛的应用,同时也成为大数据的代名词。
Hadoop is an open-source distributed computing platform that can run on large-scale clusters. It implements functions such as the MapReduce computing model and the distributed file system HDFS, and has been widely used in the industry. Meanwhile, it has also become synonymous with big data.
借助于 Hadoop,程序员可以轻松地编写分布式并行程序,并将其运行于计算机集群上,完成海量数据的存储与处理分析。
With the help of Hadoop, programmers can easily write distributed parallel programs and run them on computer clusters to complete the storage, processing and analysis of massive amounts of data.
Hadoop 是一个能够对大量数据进行分布式处理的软件框架,并且是以一种可靠、高效、可伸缩的方式进行处理的,它具有以下几个方面的特性:
Hadoop is a software framework capable of distributed processing of large amounts of data, and it conducts processing in a reliable, efficient and scalable manner. It has the following characteristics:
高可靠性。采用冗余数据存储方式,即使一个副本发生故障,其他副本也可以保证正常对外提供服务。
High reliability. It adopts a redundant data storage method. Even if one copy fails, other copies can ensure normal external services.
高效性。作为并行分布式计算平台,Hadoop 采用分布式存储和分布式处理两大核心技术,能够高效地处理PB 级数据。
High efficiency. As a parallel distributed computing platform, Hadoop employs two core technologies, namely distributed storage and distributed processing, and can efficiently handle petabyte-level data.
高可扩展性。Hadoop 的设计目标是可以高效稳定地运行在廉价的计算机集群上,可以扩展到数以千计的计算机节点上。
High scalability. The design goal of Hadoop is to be able to run efficiently and stably on inexpensive computer clusters and can be expanded to thousands of computer nodes.
高容错性。采用冗余数据存储方式,自动保存数据的多个副本,并且能够自动将失败的任务进行重新分配。
High fault tolerance. It adopts a redundant data storage method, automatically saves multiple copies of data, and can automatically redistribute failed tasks.
成本低。Hadoop 采用廉价的计算机集群,成本比较低,普通用户也很容易用自己的PC搭建 Hadoop 运行环境。
Low cost. Hadoop uses inexpensive computer clusters, so the cost is relatively low. Ordinary users can easily build a Hadoop running environment with their own PCs.
运行在Linux 操作系统上。Hadoop 是基于 Java 开发的,可以较好地运行在 Linux操作系统上。
It runs on the Linux operating system. Hadoop is developed based on Java and can run well on the Linux operating system.
支持多种编程语言。Hadoop 上的应用程序也可以使用其他语言编写,如 C++。
It supports multiple programming languages. Applications on Hadoop can also be written in other languages, such as C++.
今天的分享就到这里了
如果您对今天的文章有独特的想法
欢迎给我们留言
让我们相约明天
祝您今天过得开心快乐!
That's all for today's sharing.
If you have a unique idea about the article
please leave us a message
and let us meet tomorrow
I wish you a nice day!
文案|小h
排版|小h
审核|Dongyang
参考资料:
文字:《大数据技术原理与应用》
翻译:Kimi.ai
本文由LearningYard新学苑整理并发出,如有侵权请在后台留言!
来源:LearningYard学苑