XLSX¶
Import using Panda¶
You can the panda library function read_excel() to import Excel files into a DataFrame
Import Using openpyxl¶
To read Excel files (.xls format) in Python, you can use the openpyxl package:
import openpyxl
import pandas as pd
dataframe = openpyxl.load_workbook("./data/chapter5/titianic.xlsx")
sheet1 = dataframe['Sheet1']
rows_sheet1 = []
for row_sheet1 in sheet1.iter_rows(values_only=True):
rows_sheet1.append(row_sheet1)
df_sheet1 = pd.DataFrame(rows_sheet1[1:], columns=rows_sheet1[0])
sheet2 = dataframe['Sheet2']
rows_sheet2 = []
for row_sheet2 in sheet2.iter_rows(values_only=True):
rows_sheet2.append(row_sheet2)
df_sheet2 = pd.DataFrame(rows_sheet2[1:], columns=rows_sheet2[0])
Combining files¶
To combine .xlsx files, use the following code:
marketing_analyst_names = pd.read_excel("MarketingAnalystNames.xlsx")
sales_rep_names = pd.read_excel("SalesRepNames.xlsx")
senior_leadership_names = pd.read_excel("SeniorLeadershipNames.xlsx")
all_df_list = [marketing_analyst_names, sales_rep_names, senior_leadership_names]
appended_df = pd.concat(all_df_list)
Add index=False parameter to not include row numbers