Bank Customer Data Analysis

Data preprocessing

Get the data:

    !wget https://raw.githubusercontent.com/Rosefinch-Midsummer/Rosefinch-Midsummer.github.io/main/content/posts/file/bankpep.csv

Read the data with id as the index and show the first five rows:

    import pandas as pd
    import matplotlib.pyplot as plt

    df = pd.read_csv('bankpep.csv', index_col='id')
    print(df.head(5))

Replace the string-valued columns with numeric values:

    # Binary YES/NO columns
    seq = ['married', 'car', 'save_act', 'current_act', 'mortgage', 'pep']
    for feature in seq:
        df.loc[df[feature] == 'YES', feature] = 1
        df.loc[df[feature] == 'NO', feature] = 0

    # Replace sex: MALE -> 1, FEMALE -> 0
    df.loc[df['sex'] == 'MALE', 'sex'] = 1
    df.loc[df['sex'] == 'FEMALE', 'sex'] = 0
    print(df[0:5])

Use a dummy (indicator) matrix to handle features that take several discrete values, for example splitting children into children1, children2, children3 ...
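
As a rough illustration of the dummy-matrix idea mentioned above, the sketch below one-hot encodes a `children` column with `pd.get_dummies`. It is only an assumption about how the step might look: the small `toy` frame and its values are made up for the example (they are not rows from bankpep.csv), and the resulting column names (`children_0`, `children_1`, ...) come from the `prefix` argument rather than from the original post.

```python
import pandas as pd

# Hypothetical stand-in for the real data; values are invented for illustration.
toy = pd.DataFrame(
    {"children": [0, 1, 2, 1], "income": [100.0, 200.0, 300.0, 400.0]},
    index=pd.Index(["A", "B", "C", "D"], name="id"),
)

# Build one indicator column per distinct value of `children`.
dummies = pd.get_dummies(toy["children"], prefix="children")

# Swap the original column for its indicator columns.
encoded = toy.drop(columns="children").join(dummies)
print(encoded.columns.tolist())
```

Each row ends up with exactly one of the `children_*` indicators set, so downstream models treat the values as unordered categories rather than as a numeric scale.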

Created: 2023-05-21 | Updated: 2023-05-21 | Word count: 1857 | Reading time: 4 minutes | RM