Assignment Question
Addressing the Business Question (SQL Analysis):
Congrats on becoming a business analyst! Your database has been designed based on your requirements. Now it’s time to answer your business question: Does price positively affect customer satisfaction?
Analysis Requirements (Jupyter Notebook):
- Introduce the problem and define key terms (5-10 sentences)
  - At least one credible source for each key term defined
- Answer the business question (5-10 sentences)
  - Make sure your results are statistically significant
- Provide your top two actionable insights (5-10 sentences each)
  - Provide at least one credible source per insight.
  - Make sure to go beyond the numbers. Note that the company is likely to already be taking advantage of common metrics such as correlations and is expecting a deeper level of analysis.
- Use markdown to explain the rest of your analysis (250-450 words)
  - Remember that markdown is used to explain what you, the analyst, have found important through the code. Code comments are used to explain the technical aspects of the code.

SQL Requirements
Provide the SQL queries needed to:
- explore the data leading up to the creation of your final dataset
- develop your final dataset (this is what will be exported into Excel and then read into Python)
Make sure to include a USE statement and ample comments throughout your code.
Do not use AI to generate any of your SQL code.

Python Requirements
Your code must generate the following:
- Descriptive statistics
- Frequency tables
- Correlation
- 3-5 well-designed, highly relevant data visualizations (scatterplots, boxplots, etc.)
Make sure to avoid data dumping:
- Remove any outputs/visuals that do not directly support your insights
- Limit your tabular outputs
Do not use AI to generate any of your Python code.

Tips
To get your final dataset from SQL to Python, you may export the data from SQL into an Excel file and then import it into Python with pd.read_excel().

Avoid writing about what you did. Your stakeholders will assume that you took proper steps to analyze the data and do not have the bandwidth to read through your process. They are more interested in your answer to the business question, as well as your top two actionable insights.
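
The export-and-import tip can be illustrated with a minimal sketch. It assumes the final SQL query was exported to an .xlsx file and that an Excel engine such as openpyxl is installed. The column names and both helper functions are hypothetical and are not part of the assignment; substitute the columns of your own final dataset.

```python
import pandas as pd

# Hypothetical column names; replace with the columns of your own final query
EXPECTED_COLUMNS = ["price", "satisfaction_score"]


def check_final_dataset(df, expected_columns=EXPECTED_COLUMNS):
    """Raise ValueError if an expected column is missing; otherwise return df unchanged."""
    missing = [col for col in expected_columns if col not in df.columns]
    if missing:
        raise ValueError(f"Missing columns after import: {missing}")
    return df


def load_final_dataset(path):
    """Read the Excel export of the final SQL query and validate its columns."""
    # pd.read_excel needs an Excel engine (for example openpyxl) for .xlsx files
    df = pd.read_excel(path)
    return check_final_dataset(df)
```

Checking the columns right after the import is one way to catch a mismatch between the exported dataset and the notebook before any analysis runs.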
Note that your stakeholders will start asking questions about the validity of your results if your insights stray from the SQL queries/Python code you provide. Additional files (Excel, etc.) will not be assessed.

Deliverables
1. Submit a Jupyter Notebook in the following two formats:
   - Jupyter Notebook (.ipynb format)
   - HTML page, converted directly from the Jupyter Notebook interface (.html format)
2. Submit your SQL queries in the following two formats:
   - SQL script (.sql format)
   - Text file (.txt format)

Weighting
This assignment is worth 60% of your total grade for this course.

Rubric

Business Question and Writing Quality (25 points)
- The business question is properly introduced in relation to a market opportunity or need. In other words, it is easy to understand the value generated from addressing the business question.
- Key terms (price and customer satisfaction) are well defined and supported by credible sources. Their definitions are intuitive for the problem at hand.
- The answer to the business question is supported by the data and statistical significance tests.
- Relevant numbers have been provided to support the answer to the business question.
- Word and sentence limits are respected throughout the analysis.
- Writing is of professional quality in all parts of the deliverable.

Actionable Insights (25 points)
- Insights are highly actionable, providing practical information that can be applied to improve performance in terms of the business question. In other words, what should stakeholders do to take advantage of your answer to the business question?
- Each insight is supported by at least one credible source.
- Relevant numbers have been provided to support each actionable insight.
- Insights go beyond the numbers and explain why a finding is valuable.

SQL Analysis and Dataset Generation (25 points)
- Data has been explored through SQL queries that are relevant to the analysis.
- SQL techniques (WHERE, GROUP BY, subqueries, etc.) are used appropriately.
- The final query in the SQL script generates a dataset to be imported into Python. No bugs or errors occur between the exported dataset and the imported Python file.
- Outputs are controlled with syntax such as LIMIT. Data dumping has been minimized.
- The code contains ample comments that are focused on technical aspects (minimum of one comment for every five lines of code).

Python Analysis and Jupyter Notebook (25 points)
- Markdown is used throughout the analysis to explain results.
- Results for descriptive statistics, frequency tables, and correlation are well formatted and used appropriately.
- Data visualizations are meaningful in terms of actionable insights and/or addressing the business question. Key aspects of each data visualization are highly noticeable.
- Tabular outputs are controlled with methods such as .head(). Data dumping has been minimized.
- The code contains ample comments that are focused on technical aspects (minimum of one comment for every five lines of code).
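
As a rough illustration of the descriptive statistics, frequency table, and correlation outputs named in the rubric, the sketch below uses pandas only. The summarize helper, the column names, and the four placeholder rows are assumptions made for illustration; they are not real data and not a required structure. Significance testing and visualizations are left out.

```python
import pandas as pd


def summarize(df, numeric_cols, category_col):
    """Return descriptive statistics, a frequency table, and a correlation matrix."""
    # Descriptive statistics for the numeric columns, rounded for readability
    descriptives = df[numeric_cols].describe().round(2)
    # Frequency table: counts per category, most frequent first
    frequencies = df[category_col].value_counts()
    # Pearson correlation matrix between the numeric columns
    correlations = df[numeric_cols].corr(method="pearson").round(2)
    return descriptives, frequencies, correlations


# Placeholder rows for illustration only; not real data
sample = pd.DataFrame({
    "price": [10.0, 20.0, 30.0, 40.0],
    "satisfaction_score": [2, 3, 3, 5],
    "segment": ["A", "B", "A", "A"],
})

descriptives, frequencies, correlations = summarize(
    sample, ["price", "satisfaction_score"], "segment"
)

# .head() keeps each tabular output short and avoids data dumping
print(descriptives.head())
print(frequencies.head())
print(correlations)
```

Returning the three tables from one function keeps each output small and makes it easy to display only the ones that support an insight.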