gitmyhub

God-Of-BigData

★ 10k, updated 2y ago

Focused on big data learning and interviews; the road to becoming a big data master starts here. Flink/Spark/Hadoop/Hbase/Hive...

A Chinese-language study guide for big data engineering job interviews, covering Java foundations, distributed systems, and major frameworks like Hadoop, Spark, Flink, Kafka, and HBase through articles and linked resources.

Java, Hadoop, Spark, Flink, Kafka, HBase, Hive, Zookeeper
setup: easy, complexity 1/5

This repository is a Chinese-language study guide for people who want to work professionally with big data technologies, particularly those preparing for technical job interviews in that space. The project description translates roughly to "focused on big data learning and interviews, the road to becoming a big data master." All the content, links, and navigation are written in Chinese.

The guide is organized into four broad sections. The first covers the programming and infrastructure foundations that a big data engineer needs before touching the specialized frameworks: Java fundamentals, concurrent programming, JVM internals, distributed systems theory, a coordination service called Zookeeper, remote procedure calls, the Netty network library, and Linux basics. Each topic links out to a series of articles, mostly hosted on CSDN (a major Chinese developer blogging platform) or in markdown files inside the repo itself.

The second section covers the big data frameworks directly: Hadoop (for storing and processing very large datasets across many machines), Hive (for querying that data using SQL-like syntax), Spark and Flink (two different engines for processing data quickly, including data that is arriving in real time), HBase (a database designed for very fast lookups across huge tables), and Kafka (a system for moving streams of data between services reliably). Each framework gets its own collection of articles covering how it works, how to configure it, and common problems.

The third section collects practical, hands-on articles the author published on Flink, Spark, Kafka, and OLAP (online analytical processing, i.e. analytics database) topics. The fourth section is interview preparation: question sets and algorithm topics aimed specifically at big data engineering roles.

The repository also links to a WeChat public account and a Bilibili video channel where the author publishes additional material. It is primarily a reading and reference resource, not a runnable codebase.

Where it fits