Pandas: Loading Data & EDA

  1. Loading and Writing Data

    Question 1 of 2

      On Colab: Load schools.csv into a DataFrame, then:
      1. Display the first 8 records, the last 5 records, and a random sample of 10 records.
      2. Create manhattan_df, which consists of records from Manhattan only.
      3. Save manhattan_df to manhattan_schools.psv with a pipe (|) delimiter.
      4. Save manhattan_df to manhattan_schools.xlsx without the index.
      5. Save manhattan_df to a SQLite database with the table name manhattan_schools.
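The five steps above can be sketched as follows. This is a minimal sketch, not a reference solution: the real schools.csv is not shown here, so the block builds a small stand-in file, and the column names `school_name` and `borough`, the database file name `schools.db`, and the temporary working directory are all assumptions made for illustration. Writing .xlsx files needs an Excel engine such as openpyxl, so that step is guarded.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

# Toy stand-in for schools.csv. The column names are hypothetical;
# replace them with the ones in the real file.
toy = pd.DataFrame({
    "school_name": [f"School {i}" for i in range(12)],
    "borough": ["Manhattan", "Brooklyn", "Queens"] * 4,
})
workdir = Path(tempfile.mkdtemp())
toy.to_csv(workdir / "schools.csv", index=False)

# Load the CSV into a DataFrame.
schools_df = pd.read_csv(workdir / "schools.csv")

# 1. First 8 rows, last 5 rows, and a random sample of 10 rows.
print(schools_df.head(8))
print(schools_df.tail(5))
print(schools_df.sample(10, random_state=0))

# 2. Keep only the Manhattan rows (assumes a `borough` column).
manhattan_df = schools_df[schools_df["borough"] == "Manhattan"]

# 3. Pipe-delimited file. index=False is a choice made here; the
#    exercise only requires dropping the index for the Excel file.
manhattan_df.to_csv(workdir / "manhattan_schools.psv", sep="|", index=False)

# 4. Excel file without the index (requires an engine such as openpyxl).
try:
    manhattan_df.to_excel(workdir / "manhattan_schools.xlsx", index=False)
except ImportError:
    print("No Excel engine installed; skipping the .xlsx export")

# 5. SQLite table named manhattan_schools (the file name is hypothetical).
conn = sqlite3.connect(workdir / "schools.db")
manhattan_df.to_sql("manhattan_schools", conn, if_exists="replace", index=False)
round_trip = pd.read_sql_query("SELECT * FROM manhattan_schools", conn)
conn.close()
```

Reading the table back into `round_trip` is an optional check that the rows were actually written.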
    • Load the actors table from chinook.db. Save it as an Excel file, actors.xlsx.
    • Load the actors.xlsx file generated above. Save it as actors.tsv with a tab separator.
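The database-to-Excel-to-TSV chain in the two bullets above might look like the sketch below. Because chinook.db is not available here, the block creates a stand-in database in a temporary directory; the `actor_id` and `name` columns and their values are hypothetical. The Excel round trip assumes an engine such as openpyxl is installed (it usually is on Colab); if it is missing, the sketch skips that hop and writes the TSV directly from the DataFrame.

```python
import sqlite3
import tempfile
from pathlib import Path

import pandas as pd

workdir = Path(tempfile.mkdtemp())

# Stand-in for chinook.db with a made-up actors table.
conn = sqlite3.connect(workdir / "chinook.db")
conn.execute("CREATE TABLE actors (actor_id INTEGER PRIMARY KEY, name TEXT)")
conn.executemany(
    "INSERT INTO actors (actor_id, name) VALUES (?, ?)",
    [(1, "Actor A"), (2, "Actor B"), (3, "Actor C")],
)
conn.commit()

# Load the whole table into a DataFrame.
actors_df = pd.read_sql_query("SELECT * FROM actors", conn)
conn.close()

# Save as Excel, then load the Excel file back.
try:
    actors_df.to_excel(workdir / "actors.xlsx", index=False)
    from_excel = pd.read_excel(workdir / "actors.xlsx")
except ImportError:
    # No Excel engine available: fall back to the in-memory frame.
    from_excel = actors_df.copy()

# Save as a tab-separated file and read it back as a check.
from_excel.to_csv(workdir / "actors.tsv", sep="\t", index=False)
tsv_df = pd.read_csv(workdir / "actors.tsv", sep="\t")
print(tsv_df)
```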
  2. EDA Report

    Question 2 of 2

      On Colab: Load heart_disease_raw.csv into a DataFrame, then:
      1. Display a random sample of 10 records.
      2. Display the column names present in the data.
      3. Display the number of rows and columns in the file.
      4. Show the data type of each column and check that it matches the data.
      5. Display summary statistics for the numeric columns with .describe.
      6. Display the non-null count of each column with .info.
      7. Use .value_counts to see the distribution of the education column.
      8. Display the number of null values in each column.
      9. Generate a data profiling report using ydata-profiling and explore the result.
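A minimal sketch of the nine EDA steps above, run on a toy DataFrame rather than the real heart_disease_raw.csv. Only the `education` column is named in the exercise; the `age` and `cholesterol` columns, every value, the helper name `build_profile`, and the output file name are assumptions made for illustration. With the real file, `heart_df` would come from `pd.read_csv("heart_disease_raw.csv")` instead.

```python
import numpy as np
import pandas as pd

# Toy stand-in for heart_disease_raw.csv; all values are made up.
heart_df = pd.DataFrame({
    "age": [39, 46, 48, 61, 46, 43, 63, 45, 52, 43, 50, 41],
    "education": [4.0, 2.0, 1.0, 3.0, 3.0, 2.0, 1.0, 2.0, 1.0, 1.0,
                  np.nan, np.nan],
    "cholesterol": [195.0, 250.0, 245.0, 225.0, 285.0, 228.0, 205.0,
                    313.0, 260.0, np.nan, 254.0, 247.0],
})

print(heart_df.sample(10, random_state=0))   # 1. random sample of 10 rows
print(heart_df.columns.tolist())             # 2. column names
print(heart_df.shape)                        # 3. (rows, columns)
print(heart_df.dtypes)                       # 4. data type per column
print(heart_df.describe())                   # 5. numeric summary statistics
heart_df.info()                              # 6. non-null counts per column

# 7. Distribution of education; dropna=False also counts missing values.
education_counts = heart_df["education"].value_counts(dropna=False)
print(education_counts)

# 8. Number of null values per column.
null_counts = heart_df.isnull().sum()
print(null_counts)


def build_profile(df, out_path="heart_disease_profile.html"):
    """9. Write a ydata-profiling HTML report (hypothetical helper).

    Requires `pip install ydata-profiling`; imported lazily so the rest
    of the sketch runs without it.
    """
    from ydata_profiling import ProfileReport

    report = ProfileReport(df, title="Heart Disease EDA")
    report.to_file(out_path)
    return report
```

On Colab, calling `build_profile(heart_df)` writes the HTML report to disk; `report.to_notebook_iframe()` is an alternative for viewing it inline in the notebook.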