Intermediate Big Data
Upcoming cohorts
According to recent surveys, there are approximately 200 million active websites on the internet. Data is therefore abundant; what most people lack are the means to extract, process, and analyze it. This course covers three topics crucial to any data task: data extraction, data storage, and data processing. It uses both JavaScript and Python libraries to tackle different problems across the data processing pipeline.
This course is ideal for students seeking the next level of understanding after introductory programming.
Students attending this course will learn the following:
- Scraping and extracting data from public webpages using powerful web-scraping tools
- Storing and consolidating data in a cloud NoSQL database
- Learning the Python fundamentals needed to work with data
- Utilizing basic Python data science libraries to analyze and visualize unstructured data
Curriculum
Web Scraping
Web scraping is an essential component of modern data science. Much useful data is available only as web pages rather than through a well-defined data API. In this section, we guide students in using Puppeteer, a powerful web-scraping tool, to extract information from popular websites, including single-page applications. Students will also learn how to work with Node.js, one of the most popular programming environments.
It covers in-depth knowledge of the following:
- Node Environment
- Node Packages
- Puppeteer
- Case Studies with real examples
NoSQL Database
Firebase offers NoSQL cloud databases, including the document-oriented Cloud Firestore, that let developers store data on a scalable, stable platform with minimal configuration. It is known for its ease of use, particularly with unstructured data.
It covers in-depth knowledge of the following:
- Firebase
- Accessing Firebase with Node
- Storing scraped data to Firebase
Basic Python
Python is one of the most popular programming languages for data science. In this section, students learn how to set up the environment and development tools to start their journey in Python and data science. They also learn basic and advanced Python so that they can work with Python's data libraries afterwards.
It covers in-depth knowledge of the following:
- Python environment setup
- Python Development tools
- Basic and Advanced Python
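As a rough illustration of the level this section aims for (a hypothetical exercise, not taken from the course material), the sketch below cleans a few made-up scraped records using only the Python standard library. The record fields, the `parse_price` and `average_price_by_category` helpers, and the sample values are all assumptions for the example.

```python
from collections import defaultdict
from statistics import mean

# Hypothetical records, shaped like rows a scraper might produce.
records = [
    {"category": "books", "price": "$12.50"},
    {"category": "books", "price": "$7.50"},
    {"category": "games", "price": "$40.00"},
    {"category": "games", "price": "N/A"},  # missing price
]


def parse_price(text):
    """Convert a price string such as '$12.50' to a float, or None if invalid."""
    try:
        return float(text.replace("$", "").replace(",", ""))
    except ValueError:
        return None


def average_price_by_category(rows):
    """Group valid prices by category and return the mean of each group."""
    grouped = defaultdict(list)
    for row in rows:
        price = parse_price(row["price"])
        if price is not None:  # skip rows whose price could not be parsed
            grouped[row["category"]].append(price)
    return {category: mean(prices) for category, prices in grouped.items()}


averages = average_price_by_category(records)
print(averages)
```

Data science libraries such as Pandas can usually express the same grouping more concisely; the plain-Python version is shown here only to make each step explicit.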
Basic Data Science
Many data science libraries make working with data easier. With their help, students can extract, cleanse, and visualise the data stored in cloud databases. In this section, students learn how to post-process, analyze, and present data using these libraries.
It covers in-depth knowledge of the following:
- NumPy - Matrix and tensor manipulation tools
- Pandas - Multi-format data processing tools
- Seaborn - Statistical visualisation tools
- Matplotlib - 2D plotting tools
Instructor team
Alex Lau
Co-Founder
Alex is an award-winning IT professional with extensive experience in software development, project management, and technology solutions. He is proficient in programming languages such as C, C#, JavaScript, TypeScript, and Python. Alex has trained over 700 students to become software developers and has overseen corporate training programs for companies such as Swire Coca-Cola and HKTDC. He has been recognized with multiple honors, including the ICT Grand Award, the HSBC Youth Business Award, and the Esperanza Reimagine Education Challenge Award. He is also an AWS Community Builder and an AWS Certified Solutions Architect - Professional. Alex's passion for learning, teaching, and programming drives him to raise the standard and competitiveness of the IT coaching industry.
Gordon Lau
Co-Founder
Gordon has held multiple software development and leadership roles at companies across various industries. With over 10 years' experience in professional programming and 4 years' experience in technology education, he has mentored more than 300 newcomers breaking into the technology sector. He developed the chatroom application HKGChat, which acquired over 3,000 users on its launch day, and is the principal developer of Tecky Code, Hong Kong's first programming learning platform open to the public. As a firm believer in developing the future of Hong Kong's IT industry, he has been promoting the importance of programming in mainstream education. Gordon is also an avid foreign language, science, and travel enthusiast.
- Next start date
- Time: 19:00 - 21:30, every Tuesday and Thursday
- Duration: 6 weeks
- Class size
- Location: The Wave Mongkok