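The original DataFrame contains numbers with dots in them, for example: 3.200.000. In this case the dot, rather than the comma, is the thousands separator. I tried to remove the thousands separators with the following code: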
pattern_shareholding_numbers = re.compile(r'[\d.]*\d+')

shareholding_percentage_df = df[(~df["Jumlah Lembar Saham"].str.startswith("Saham") & (df["Jabatan"] == "-"))]
shareholding_percentage_df = df[(~df["Jumlah Lembar Saham"].str.startswith("Jumlah Lembar Saham") & (df["Jabatan"] == "-"))]
shareholding_percentage_df.reset_index(drop=True, inplace=True)

shareholding_percentage_list = df["Jumlah Lembar Saham"].to_list()
shareholding_percentage_string = ' '.join(shareholding_percentage_list)
matches = pattern_shareholding_numbers.findall(shareholding_percentage_string)

matches_dot_removed = []
for dot in matches:
    dot_removed = []
    for e in dot:
        e = e.replace('.', '')
        e = e.replace('.', '')
        dot_removed.append(e)
    matches_dot_removed.append(dot_removed)

shareholding_percentage_float = str(matches_dot_removed).rstrip('')
print(shareholding_percentage_float)
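The code above does strip out the thousands separators, but it now returns each number split into individual characters, with an empty string where each dot used to be: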
[['3', '', '2', '0', '0', '', '0', '0', '0'], ['2', '', '9', '0', '0', '', '0', '0', '0'], ['2', '', '9', '0', '0', '', '0', '0', '0'], ['1', '', '0', '0', '0', '', '0', '0', '0']]
I am trying to find a way to remove the empty strings and join the digits back together so that the result looks like this:
['3200000'], ['2900000'], ['2900000'], ['1000000']
Is there a way to do this?
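One possible approach, sketched below: each item in `matches` is already a whole string such as `'3.200.000'`, so the inner `for e in dot` loop walks over it one character at a time, and every dot turns into an empty string. Calling `replace` once on the whole string avoids the split. The `sample_text` value here is a hypothetical stand-in for the joined "Jumlah Lembar Saham" column, since the real data is not shown.

```python
import re

# Hypothetical sample standing in for the joined "Jumlah Lembar Saham" text.
sample_text = "3.200.000 2.900.000 2.900.000 1.000.000"

pattern_shareholding_numbers = re.compile(r'[\d.]*\d+')
matches = pattern_shareholding_numbers.findall(sample_text)

# Each match is a full string like '3.200.000'. Apply replace() to the whole
# string rather than to each character, then wrap it in a one-item list to
# match the desired output shape.
matches_dot_removed = [[match.replace('.', '')] for match in matches]

print(matches_dot_removed)
# [['3200000'], ['2900000'], ['2900000'], ['1000000']]
```

If the nested lists are not actually needed, `[match.replace('.', '') for match in matches]` gives a flat list of strings. Assuming the column holds only such numbers, `df["Jumlah Lembar Saham"].str.replace(".", "", regex=False)` should do the same directly on the column without the regex step.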