When reading data with pandas, there are many reading options to choose from. Here are some tips.
import pandas as pd
df = pd.read_csv('../data/XXX.csv',sep='\t',nrows = 100)
1.1 When setting the path, pay attention to the direction of the slashes. On Windows, it is recommended to write the path as a raw string, as follows:
df = pd.read_csv(r'C:\Users\12810\Desktop\temp\XXX.csv',sep='\t',nrows = 100) # The r prefix makes this a raw string, so the backslashes do not need to be changed or escaped
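To see why the raw-string prefix matters, here is a minimal sketch. The path is hypothetical and used only for illustration; without the `r` prefix, a sequence such as `\t` inside the path would be read as a tab character.

```python
from pathlib import PureWindowsPath

# Hypothetical Windows path, used only for illustration.
raw_path = r'C:\Users\demo\temp\data.csv'          # raw string: backslashes are kept literally
escaped_path = 'C:\\Users\\demo\\temp\\data.csv'   # same path with every backslash escaped
forward_path = 'C:/Users/demo/temp/data.csv'       # forward slashes need no escaping

print(raw_path == escaped_path)        # True
print(len('\temp'), len(r'\temp'))     # 4 5  ('\t' collapses into one tab character)

# PureWindowsPath treats both slash directions as separators.
print(PureWindowsPath(raw_path) == PureWindowsPath(forward_path))  # True
```

Any of the three spellings can be passed to `pd.read_csv`; the raw string is simply the one that lets you paste a path copied from Windows Explorer unchanged.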
1.2 Setting nrows=100 reads only the first 100 rows. While you are first writing and debugging the code, this speeds up each run considerably.
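The effect of `nrows` can be checked with a small self-contained sketch. It uses an in-memory buffer as a stand-in for a large tab-separated file; the column names `id` and `value` are made up for this example.

```python
import io

import pandas as pd

# Stand-in for a large tab-separated file (1000 data rows plus a header).
rows = ['id\tvalue'] + [f'{i}\t{i * 2}' for i in range(1000)]
fake_file = io.StringIO('\n'.join(rows))

# Only the first 100 data rows are parsed; the rest of the file is not read.
sample = pd.read_csv(fake_file, sep='\t', nrows=100)
print(sample.shape)  # (100, 2)
```

Once the code works on the sample, remove `nrows` to process the full file.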
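df['new_col'] = XX # Add a new column
df['new_col'].apply(lambda x: x.split(',')) # Split each value on commas (assumes the column holds strings); calling apply on the whole DataFrame would pass entire columns to the lambda instead
# For more on using zip(), map(), and lambda together, please refer to: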
# https://blog.csdn.net/HG0724/article/details/117374802
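As a quick illustration of how `map()`, `zip()`, and `lambda` combine, here is a minimal pure-Python sketch. The sample strings are invented for the example.

```python
# Comma-separated strings, similar to what a text column might contain.
raw_values = ['a,b', 'c,d,e', 'f']

# map() applies the lambda to every element; list() materializes the result.
parts = list(map(lambda s: s.split(','), raw_values))
print(parts)    # [['a', 'b'], ['c', 'd', 'e'], ['f']]

# map() also accepts an existing function such as len.
lengths = list(map(len, parts))
print(lengths)  # [2, 3, 1]

# zip() pairs up elements from several sequences position by position.
pairs = list(zip(raw_values, lengths))
print(pairs)    # [('a,b', 2), ('c,d,e', 3), ('f', 1)]
```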
By convention, a single standalone underscore is sometimes used as a name to indicate that a variable is temporary or irrelevant.
for _ in range(10):
    print('I am a :',_)
I am a : 0
I am a : 1
I am a : 2
I am a : 3
I am a : 4
I am a : 5
I am a : 6
I am a : 7
I am a : 8
I am a : 9
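The underscore is more commonly seen where the value is never used at all. A short sketch of two such cases follows; the record contents are invented for illustration.

```python
# Repeat an action a fixed number of times without using the loop counter.
greetings = ['hello' for _ in range(3)]
print(greetings)    # ['hello', 'hello', 'hello']

# Ignore a field while unpacking; only the name and the score are needed here.
record = ('alice', 'unused-field', 90)
name, _, score = record
print(name, score)  # alice 90
```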
df['label'].value_counts().plot(kind='bar')
In data analysis, this counts how many times each distinct value occurs in a column and plots the counts as a bar chart. Very practical.
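To inspect the counts before plotting them, a minimal sketch on a made-up `label` column looks like this. The data is invented, and the plotting call is left as a comment because it needs matplotlib installed.

```python
import pandas as pd

# Small made-up dataset with a categorical 'label' column.
df_demo = pd.DataFrame({'label': ['cat', 'dog', 'cat', 'bird', 'cat', 'dog']})

# value_counts() returns a Series sorted from most to least frequent.
counts = df_demo['label'].value_counts()
print(counts['cat'], counts['dog'], counts['bird'])  # 3 2 1

# counts.plot(kind='bar') would draw the bar chart (requires matplotlib).
```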
from collections import Counter # Counter is a counting class in the collections module
Reference: https://blog.csdn.net/ch_improve/article/details/89388389
# Count the number of characters
str_1 = 'wdqdqwdqwqwd11dq2wd'
count_result = Counter(str_1) # Count how many times each character occurs
print(count_result)
Counter({'d': 6, 'w': 5, 'q': 5, '1': 2, '2': 1})
count_result.most_common(3)
[('d', 6), ('w', 5), ('q', 5)]
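Counter works on any iterable of hashable items, not only strings. The following sketch counts words in an invented sentence and shows two other handy behaviors: missing keys return 0, and `update()` adds more observations.

```python
from collections import Counter

# Made-up sentence, split into words.
words = 'the cat and the dog and the bird'.split()
word_counts = Counter(words)

print(word_counts.most_common(2))  # [('the', 3), ('and', 2)]

# A key that was never seen returns 0 instead of raising KeyError.
print(word_counts['fish'])         # 0

# update() adds new observations to the existing counts.
word_counts.update(['cat', 'cat'])
print(word_counts['cat'])          # 3
```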