Python df.to_excel() stores numbers as text in Excel. How to store them as values?

fileupload 2023. 7. 7. 21:01

I am scraping table data from Google Finance with pd.read_html and then saving that data to Excel with df.to_excel(), as shown below:

    dfs = pd.read_html('https://www.google.com/finance?q=NASDAQ%3AGOOGL&fstype=ii&ei=9YBMWIiaLo29e83Rr9AM', flavor='html5lib')
    xlWriter = pd.ExcelWriter('output.xlsx', engine='xlsxwriter')

    for i, df in enumerate(dfs):
        df.to_excel(xlWriter, sheet_name='Sheet{}'.format(i))
    xlWriter.close()  # save() was removed in pandas 2.0; close() writes the file

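However, the numbers saved to Excel are stored as text, with a small green triangle in the corner of each cell. When moving this data to Excel, how can I store the numbers as actual values rather than text?

In addition to the other solutions, where the string data is converted to numbers when the dataframe is created or used, this can also be done with an option of the xlsxwriter engine: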

# Versions of Pandas >= 1.3.0:
writer = pd.ExcelWriter('output.xlsx',
                        engine='xlsxwriter',
                        engine_kwargs={'options': {'strings_to_numbers': True}})

# Versions of Pandas < 1.3.0:
writer = pd.ExcelWriter('output.xlsx',
                        engine='xlsxwriter',
                        options={'strings_to_numbers': True})

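From the docs:

strings_to_numbers: Enable the worksheet.write() method to convert strings to numbers, where possible, using float(), in order to avoid an Excel warning about "Numbers Stored as Text".

Consider converting the numeric columns to floats, since pd.read_html reads web data as string types (i.e., objects). Before converting to floats, however, you need to replace the hyphens with NaNs.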
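To make the quoted description concrete, here is a rough pure-Python sketch of the per-cell decision the option implies. It is not xlsxwriter's actual code: the helper name `to_number_if_possible` is hypothetical, and skipping NaN and infinity is an assumption about how non-finite results are treated.

```python
import math


def to_number_if_possible(value):
    """Return a float when `value` is a string that parses as a finite number.

    Anything else (non-strings, unparseable text, NaN/inf) is returned unchanged.
    Hypothetical helper for illustration only.
    """
    if not isinstance(value, str):
        return value
    try:
        number = float(value)
    except ValueError:
        return value  # e.g. "-" or "Total Revenue" stays text
    if math.isnan(number) or math.isinf(number):
        return value  # assumption: non-finite values are left as text
    return number


cells = ["1234.5", "-", "42", "Total Revenue", "74,989.00"]
converted = [to_number_if_possible(cell) for cell in cells]
print(converted)
```

Note that a string with a thousands separator such as "74,989.00" is not accepted by float(), so a conversion of this kind would leave it as text.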

import pandas as pd
import numpy as np

dfs = pd.read_html('https://www.google.com/finance?q=NASDAQ%3AGOOGL' +
                   '&fstype=ii&ei=9YBMWIiaLo29e83Rr9AM', flavor='html5lib')
xlWriter = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
workbook = xlWriter.book

for i, df in enumerate(dfs):
    for col in df.columns[1:]:                  # UPDATE ONLY NUMERIC COLS 
        df.loc[df[col] == '-', col] = np.nan    # REPLACE HYPHEN WITH NaNs
        df[col] = df[col].astype(float)         # CONVERT TO FLOAT   

    df.to_excel(xlWriter, sheet_name='Sheet{}'.format(i))

xlWriter.close()  # save() was removed in pandas 2.0; close() writes the file
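The same hyphen-to-NaN conversion can be tried offline on a small made-up frame. This sketch swaps the explicit replace-then-astype loop above for pd.to_numeric with errors="coerce", which is a different technique: it turns every unparseable entry into NaN, not only hyphens. The column names and figures are invented for illustration.

```python
import pandas as pd

# Made-up stand-in for one scraped table; every cell is a string, as read_html returns it.
df = pd.DataFrame(
    {
        "item": ["Revenue", "Other Revenue", "Net Income"],
        "2015": ["74989.00", "-", "16348.00"],
        "2014": ["66001.00", "-", "14136.00"],
    },
    dtype=object,
)

for col in df.columns[1:]:  # skip the first (label) column
    # errors="coerce" maps "-" (and anything else that is not a number) to NaN
    df[col] = pd.to_numeric(df[col], errors="coerce")

print(df.dtypes)
```

Because coercion is silent, the stricter astype(float) approach may be preferable when an unexpected non-numeric cell should raise an error instead of quietly becoming NaN.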

This is probably because the data type of the columns where the warning appears is object rather than a numeric type such as int or float.

To check the data type of each column of the dataframe, use:

print(df.dtypes)

In my case, the column stored as object instead of as a numeric value was PRECO_ES.

In my particular case, since the decimal places matter, I converted it to float using astype, as follows:

df['PRECO_ES'] = df['PRECO_ES'].astype(float)

Checking the data types again, PRECO_ES is now reported as float64.

Then all that remains is to export the dataframe to Excel.

# Export the DataFrame (df) to XLS
# (pandas 2.0 removed the xlwt engine used for .xls; use an .xlsx file name there)
xlsFile = "Preco20102019.xls"
df.to_excel(xlsFile)

# Export the DataFrame (df) to CSV
csvFile = "Preco20102019.csv"
df.to_csv(csvFile)

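When you then open the Excel file, the values are stored as numbers rather than text, so the warning no longer appears.

Have you verified that the columns you are exporting are actually numbers in Python (int or float)?

Alternatively, you can use the =VALUE() function in Excel to convert a text field to a number.

Since pandas 0.19, you can pass the na_values argument to pd.read_html, which lets it automatically and correctly infer a float dtype for the price columns.

It looks like this: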
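The check, convert, and re-check steps of this answer can be tried on a small made-up frame. Only the column name PRECO_ES comes from the answer; the PRODUTO column and the values are hypothetical. The sketch uses is_numeric_dtype rather than comparing against object, since the dtype reported for string columns can differ between pandas versions.

```python
import pandas as pd
from pandas.api.types import is_numeric_dtype

# Hypothetical data: prices arrive as strings, so Excel would flag them as text.
df = pd.DataFrame({"PRODUTO": ["A", "B"], "PRECO_ES": ["10.5", "7.25"]})

numeric_before = is_numeric_dtype(df["PRECO_ES"])  # strings are not numeric

df["PRECO_ES"] = df["PRECO_ES"].astype(float)      # convert to float

numeric_after = is_numeric_dtype(df["PRECO_ES"])   # now a float column

print(df.dtypes)
```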

dfs = pd.read_html(
    'https://www.google.com/finance?q=NASDAQ%3AGOOGL&fstype=ii&ei=9YBMWIiaLo29e83Rr9AM',
    flavor='html5lib',
    index_col='\nIn Millions of USD (except for per share items)\n',
    na_values='-'
)

xlWriter = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
for i, df in enumerate(dfs):
    df.to_excel(xlWriter, sheet_name='Sheet{}'.format(i))
xlWriter.close()  # save() was removed in pandas 2.0; close() writes the file

Or (if you don't have pandas 0.19 yet) use a simplified version of @Parfait's solution:
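The effect of na_values on dtype inference can be seen without a network connection. As a stand-in, this sketch uses pd.read_csv on an inline string, on the assumption that na_values behaves the same way there as in pd.read_html; the table contents are made up.

```python
import io

import pandas as pd
from pandas.api.types import is_numeric_dtype

# Made-up table in which "-" marks a missing figure.
raw = "item,2015,2014\nRevenue,74989.00,66001.00\nOther Revenue,-,-\n"

# Without na_values the "-" cells force each column to be read as text.
plain = pd.read_csv(io.StringIO(raw), index_col="item")

# With na_values="-" the hyphens become NaN and the columns are inferred as float.
with_na = pd.read_csv(io.StringIO(raw), index_col="item", na_values="-")

print(plain.dtypes)
print(with_na.dtypes)
```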

dfs = pd.read_html(
    'https://www.google.com/finance?q=NASDAQ%3AGOOGL&fstype=ii&ei=9YBMWIiaLo29e83Rr9AM',
    flavor='html5lib',
    index_col='\nIn Millions of USD (except for per share items)\n'
)

xlWriter = pd.ExcelWriter('Output.xlsx', engine='xlsxwriter')
for i, df in enumerate(dfs):
    df.mask(df == '-').astype(float).to_excel(xlWriter, sheet_name='Sheet{}'.format(i))
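xlWriter.close()  # save() was removed in pandas 2.0; close() writes the file

This second solution only works if you have defined the index column correctly (in .read_html), and it fails miserably with a ValueError if any of the (data) columns contains an entry that cannot be converted to a float.

If you want the Excel sheet to hold a string data type instead, do the following: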
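Both behaviours of the mask-then-astype approach can be reproduced on small made-up frames: the conversion succeeds when hyphens are the only non-numeric entries, and raises ValueError otherwise. The data and the "n/a" marker are hypothetical.

```python
import pandas as pd

# Hyphens are the only non-numeric cells, so the conversion succeeds.
clean = pd.DataFrame(
    {"2015": ["74989.00", "-"], "2014": ["66001.00", "14136.00"]},
    dtype=object,
)
converted = clean.mask(clean == "-").astype(float)

# An entry such as "n/a" is not masked, and float conversion rejects it.
dirty = pd.DataFrame({"2015": ["74989.00", "n/a"]}, dtype=object)
try:
    dirty.mask(dirty == "-").astype(float)
    failed = False
except ValueError:
    failed = True

print(converted)
print("conversion failed:", failed)
```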

for col in original_columns:
    df_employees[col] = df_employees[col].astype(pd.StringDtype())
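A self-contained version of the snippet above might look as follows. The frame contents and the choice of zip_code as the column to keep as text are hypothetical; only the names df_employees and original_columns come from the answer.

```python
import pandas as pd

# Hypothetical data: zip codes are identifiers and should stay text in Excel.
df_employees = pd.DataFrame({"employee_id": [7, 42], "zip_code": [2134, 90210]})
original_columns = ["zip_code"]  # columns to store as strings

for col in original_columns:
    df_employees[col] = df_employees[col].astype(pd.StringDtype())

print(df_employees.dtypes)
```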

Source URL: https://stackoverflow.com/questions/41080999/python-df-to-excel-storing-numbers-as-text-in-excel-how-to-store-as-value
