🔬 What is Data Science?
Data science is a combination of mathematics, statistics, coding, technology, and creativity. It's as much a technical process as a creative one! Data science allows you to process large amounts of information and gain insight into what patterns the data present.
Data science:
- Is an evolving subject with no single definition
- Requires a range of skills (including coding, statistics, and domain knowledge)
- Takes unstructured data and finds order, meaning, and value in it
Formally, “data science is the exploration and quantitative analysis of all available structured and unstructured data to develop understanding, extract knowledge, and formulate actionable results.”
Note: Structured data follows a predefined format and can be organized in rows and columns, like a spreadsheet or a database table. It is often quantitative, meaning that it has countable elements. Unstructured data has no predefined format and is often qualitative. Examples of unstructured data include audio and video files, images, and answers to a free-response feedback form.
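To make the difference concrete, here is a minimal Python sketch. The camper records and the feedback sentence are made up for illustration, and the variable names are assumptions rather than anything used later in the course. It shows why structured data is easy to summarize, while unstructured text needs some structure added before it can be analyzed.

```python
# Structured data: every record has the same named fields,
# so it fits neatly into rows and columns. (Values are made up.)
structured = [
    {"camper": "A", "age": 15, "hours_coded": 4},
    {"camper": "B", "age": 16, "hours_coded": 7},
    {"camper": "C", "age": 15, "hours_coded": 5},
]

# Because the fields are consistent, a summary is a single expression.
average_hours = sum(row["hours_coded"] for row in structured) / len(structured)

# Unstructured data: free text has no predefined fields. (Made-up response.)
unstructured = "I loved the dashboard project, but I wish we had more time for SQL!"

# To analyze it, we first impose some structure, here by splitting it into words.
words = unstructured.lower().replace(",", "").replace("!", "").split()
mentions_sql = "sql" in words

print("Average hours coded:", round(average_hours, 2))
print("Feedback mentions SQL:", mentions_sql)
```

Real unstructured data, such as images or audio, usually needs far more preparation than splitting a sentence into words, but the idea is the same.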
⭕️ Data Science Venn Diagram
What topics make up data science?

Data science includes coding, statistics, and domain expertise. Domain expertise can be in law, medicine, politics, or anything your data is relevant to. The intersection of these topics makes up data science!
How does AI fit into data science?
AI, and more specifically machine learning, is an essential component of data science in the 21st century. Data scientists use statistical methods and tools to derive insights from data. Machine learning goes one step further: it uses data to “learn” patterns, in a way loosely inspired by human cognition, and then applies those patterns to solve problems and predict future outcomes.
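As a rough illustration of what “learning” from data can mean, the sketch below fits a straight line to a handful of made-up points and uses it to predict a new value. This is ordinary least-squares line fitting, one of the simplest techniques that machine learning builds on. The data and the function name `fit_line` are hypothetical.

```python
# Made-up data: hours studied (x) and quiz score (y) for five students.
hours = [1, 2, 3, 4, 5]
scores = [52, 60, 71, 79, 88]

def fit_line(xs, ys):
    """Find the slope and intercept of the least-squares line through the points."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    # Slope: how much y changes, on average, for each one-unit change in x.
    slope = (
        sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
        / sum((x - mean_x) ** 2 for x in xs)
    )
    intercept = mean_y - slope * mean_x
    return slope, intercept

# "Learning": estimate the pattern from the existing data.
slope, intercept = fit_line(hours, scores)

# "Predicting": apply the learned pattern to a value we have not seen.
predicted = slope * 6 + intercept
print(f"Predicted score after 6 hours: {predicted:.1f}")
```

Real machine learning models are more complex, but they follow the same two steps: estimate a pattern from known data, then apply it to new data.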
Data scientists use machine learning models to identify patterns and make predictions or decisions based on data. While machine learning is an exciting and important part of data science, we won't spend much time on it in the next two weeks. This course is designed to help you master the basics of data science: how to work with data, clean it, analyze it, and share what you’ve discovered in clear, meaningful ways. These are the building blocks of data science, and they’ll set you up for success whether you decide to explore machine learning later or use your skills in other areas.
Our course will culminate with an exciting project where you’ll build an interactive data dashboard. This dashboard will showcase insights on a topic you care about, allowing you to tell a compelling story with data. By the end of these two weeks, you’ll have the tools to continue your data science journey with confidence and curiosity.
What does a Data Scientist do?
A data scientist's goal is to create value for their company or organization by extracting meaning from data. They do this by forming hypotheses, gathering data, cleaning data (the process of finding and fixing problems in a data set), analyzing data, and interpreting the results.
Data scientists generally have an interest in data collection and analysis, enjoy problem-solving and communicating with others, and have a background in mathematics, statistics, or computer science. Data scientists can work for any type of organization they are passionate about, because data skills can be applied to any subject, idea, or area of interest.
👋 Here are some profiles of data scientists and what they work on:
🎥 Click here to see a video describing what it is like to be a Data Scientist at Google.
Try-It: Find other profiles of data scientists that you would like to share with your camp!
🔓 Why Data Science?
- The field needs more data scientists like you! You bring your own important perspective on where analysis can be most beneficial and impactful.
- Data science incorporates many different subjects, so you can find what excites you!
- Data skills can be applied to any subject, idea, or area of interest you are passionate about!
Data Science Applications
- Health Care: Data science is used for medical image analysis, pharmaceutical research and development, and even improving medical facility operations.
- E-commerce: Data science has helped businesses identify target markets, optimize pricing, and employ recommendation systems.
- Transportation: From self-driving cars to analyzing driver behavior, car manufacturers are using data science to create smarter, safer vehicles.
💻 Programming Languages

The most frequently used programming languages for statistics are Python and R.
SQL is typically used with databases.
Data Science and KWK
Over the next two weeks, we will act as data scientists, using data to create data visualizations (graphical representations of information and data). Data scientists use data visualizations to see and understand trends, outliers, and patterns in data. We will be using SQL to query our data and Tableau to build our data visualizations.
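To give a feel for what SQL looks like, here is a minimal sketch that runs a query from Python using the built-in `sqlite3` module. The `trips` table, its columns, and its values are all hypothetical. The text bars printed at the end are only a stand-in for the kind of chart we would build in Tableau.

```python
import sqlite3

# Create a temporary in-memory database with a made-up table.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE trips (city TEXT, riders INTEGER)")
conn.executemany(
    "INSERT INTO trips VALUES (?, ?)",
    [("Austin", 120), ("Boston", 95), ("Austin", 80),
     ("Chicago", 150), ("Boston", 90)],
)

# SQL query: total riders per city, largest first.
query = """
SELECT city, SUM(riders) AS total_riders
FROM trips
GROUP BY city
ORDER BY total_riders DESC
"""
rows = conn.execute(query).fetchall()
conn.close()

# A very simple text "bar chart": one # for every 10 riders.
for city, total in rows:
    print(f"{city:<8} {'#' * (total // 10)} {total}")
```

The query does the summarizing (grouping and adding up), and the visualization makes the result easy to compare at a glance.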
🧰 Data Science Process
- Define a problem: Data scientists are always curious and ask a lot of questions, so the first step of any data science project is figuring out what problem you want to solve!
- Collect data: Once we know our problem, we can go out into the world and collect the data.
- Prepare data: Data does not normally arrive in the shape or form we want it to, so we have to spend time cleaning the data to ensure it works for us.
- Analyze data: At this step, we can understand relationships in the data. We can understand what the data tells us about the problem and what features are important in our data.
- Visualize data: We will use Tableau to visualize our data.
- Communicate insights: We can now take our visualizations and communicate our insights to help stakeholders understand the problem and data.
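The steps above can be sketched end to end in a few lines of Python. Everything here is an assumption for illustration: the question, the survey responses, and the `prepare` helper are made up, and the text bars stand in for a Tableau chart.

```python
from statistics import mean

# 1. Define a problem (hypothetical question):
#    do campers who sleep more report higher energy?

# 2. Collect data: made-up survey responses, exactly as they "arrived".
raw = [
    {"name": "Ana", "sleep_hours": "8", "energy": "9"},
    {"name": "Bea", "sleep_hours": "",  "energy": "6"},  # missing value
    {"name": "Cam", "sleep_hours": "6", "energy": "5"},
    {"name": "Dee", "sleep_hours": "9", "energy": "8"},
    {"name": "Eve", "sleep_hours": "5", "energy": "4"},
    {"name": "Ana", "sleep_hours": "8", "energy": "9"},  # duplicate response
]

# 3. Prepare data: drop incomplete and duplicate rows, convert text to numbers.
def prepare(rows):
    seen = set()
    clean = []
    for row in rows:
        if row["name"] in seen or not row["sleep_hours"] or not row["energy"]:
            continue
        seen.add(row["name"])
        clean.append({
            "name": row["name"],
            "sleep_hours": int(row["sleep_hours"]),
            "energy": int(row["energy"]),
        })
    return clean

clean = prepare(raw)

# 4. Analyze data: compare average energy for the two groups.
avg_more = mean(r["energy"] for r in clean if r["sleep_hours"] >= 7)
avg_less = mean(r["energy"] for r in clean if r["sleep_hours"] < 7)

# 5. Visualize data: simple text bars in place of a real chart.
for label, value in [("7+ hours", avg_more), ("< 7 hours", avg_less)]:
    print(f"{label:<10} {'#' * round(value)} {value}")

# 6. Communicate insights: state the finding in plain language.
summary = (
    f"Campers who slept 7+ hours reported an average energy of {avg_more}, "
    f"compared with {avg_less} for campers who slept less."
)
print(summary)
```

With only four usable responses, this result would not be strong evidence of anything. The point is the shape of the workflow, not the conclusion.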
Check out this Lego depiction of the Data Science Process!
Answer each of these questions in your journal:
- How would you explain data science in your own words?
- How do data scientists uncover insights from large amounts of data?
- Can you think of some other real-world applications of data science?
