Pandas DataFrame to MySQL Database using iris Data

Page content

개요

  • 이전 강의에 이어서 진행한다. (MySQL Select Clause via Python)
  • 임의의 Pandas 데이터 프레임에서 MySQL DB로 추가하는 코드를 작성한다.

주요 라이브러리 설치

  • 아래와 같이 주요 라이브러리를 설치한다.
    • MySQL과 관련된 주요 Python 라이브러리를 설치한다.
pip install mysql-connector mysql-connector-python pymysql SQLAlchemy seaborn pandas

코드 작성(mysql-connector)

  • 아래와 같이 코드를 작성한다.
# 파일명 : db.py
import mysql.connector
import pandas as pd
import seaborn as sns

mydb = mysql.connector.connect(
    host = "localhost", 
    user = "root", 
    passwd = "evan",
    database = "muldb"
)

print(mydb)

iris_df = sns.load_dataset('iris')

my_cursor = mydb.cursor()

create_table_query  = """
   CREATE TABLE IF NOT EXISTS iris (
    id INT AUTO_INCREMENT PRIMARY KEY,
    sepal_length FLOAT,
    sepal_width FLOAT,
    petal_length FLOAT,
    petal_width FLOAT,
    species VARCHAR(255)
    ); 
"""

my_cursor.execute(create_table_query)

insert_query = """
    INSERT INTO iris (sepal_length, sepal_width, petal_length, petal_width, species) 
    VALUES (%s, %s, %s, %s, %s)
"""

data = [tuple(x) for x in iris_df.to_numpy()]

for row in data:
    my_cursor.execute(insert_query, row)

# DB Commit
mydb.commit()

# 종료
my_cursor.close()
mydb.close()
  • 이번에는 Python 파일을 실행한다.
$ python db.py 
<mysql.connector.connection_cext.CMySQLConnection object at 0x0000025FFC155BD0>
  • 이번에는 MySQL Workbench에서 확인한다.

Untitled

코드 작성(PyMySQL)

  • 아래와 같이 코드를 작성한다.
    • 기존 테이블과 구별하기 위해 iris2로 테이블을 생성했다.
# 파일명 : db.py
import pymysql
import pandas as pd
import seaborn as sns

mydb = pymysql.connect(
    host = "localhost", 
    user = "root", 
    passwd = "evan",
    database = "muldb"
)

print(mydb)

iris_df = sns.load_dataset('iris')

my_cursor = mydb.cursor()

create_table_query  = """
   CREATE TABLE IF NOT EXISTS iris2 (
    id INT AUTO_INCREMENT PRIMARY KEY,
    sepal_length FLOAT,
    sepal_width FLOAT,
    petal_length FLOAT,
    petal_width FLOAT,
    species VARCHAR(255)
    ); 
"""

my_cursor.execute(create_table_query)

insert_query = """
    INSERT INTO iris2 (sepal_length, sepal_width, petal_length, petal_width, species) 
    VALUES (%s, %s, %s, %s, %s)
"""

data = [tuple(x) for x in iris_df.to_numpy()]

for row in data:
    my_cursor.execute(insert_query, row)

# DB Commit
mydb.commit()

# 종료
my_cursor.close()
mydb.close()
  • 이번에는 Python 파일을 실행한다.
$ python db.py 
<pymysql.connections.Connection object at 0x000001FCC3529250>
  • 이번에는 MySQL Workbench에서 확인한다.

Untitled

코드 작성(sqlalchemy)

  • 아래와 같이 코드를 작성한다.
    • 기존 테이블과 구별하기 위해 iris3로 테이블을 생성했다.
# 파일명 : db.py
import pandas as pd
import seaborn as sns
import pymysql
from sqlalchemy import create_engine, Column, Integer, String, Float, Sequence
from sqlalchemy.ext.declarative import declarative_base
from sqlalchemy.orm import sessionmaker
from sqlalchemy import create_engine

iris_df = sns.load_dataset('iris')

database_username = 'root'
database_password = 'evan'
database_ip       = 'localhost'
database_name     = 'muldb'
database_connection = create_engine(f'mysql+pymysql://{database_username}:{database_password}@{database_ip}/{database_name}',  echo=True)
Session = sessionmaker(bind=database_connection)
session = Session()

Base = declarative_base()

# 테이블 생성
class Iris(Base):
    __tablename__ = 'iris3'
    id = Column(Integer, autoincrement=True, primary_key=True)
    sepal_length = Column(Float)
    sepal_width = Column(Float)
    petal_length = Column(Float)
    petal_width = Column(Float)
    species = Column(String(50))

Base.metadata.create_all(database_connection)

iris_df.to_sql(con=database_connection, name='iris3', if_exists='replace', index=False)

print("완료")
  • 이번에는 Python 파일을 실행한다.
$ python db.py 
C:\Users\user\OneDrive\Desktop\mysql_conn\db.py:20: MovedIn20Warning: The ``declarative_base()`` function is now available as sqlalchemy.orm.declarative_base(). (deprecated since: 2.0) (Background on SQLAlchemy 2.0 at: https://sqlalche.me/e/b8d9)
  Base = declarative_base()
2024-04-05 16:34:08,303 INFO sqlalchemy.engine.Engine SELECT DATABASE()
2024-04-05 16:34:08,303 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,304 INFO sqlalchemy.engine.Engine SELECT @@sql_mode
2024-04-05 16:34:08,304 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,304 INFO sqlalchemy.engine.Engine SELECT @@lower_case_table_names
2024-04-05 16:34:08,304 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,305 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-04-05 16:34:08,305 INFO sqlalchemy.engine.Engine DESCRIBE `muldb`.`iris3`
2024-04-05 16:34:08,305 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,306 INFO sqlalchemy.engine.Engine
CREATE TABLE iris3 (
        id INTEGER NOT NULL AUTO_INCREMENT,
        sepal_length FLOAT,
        sepal_width FLOAT,
        petal_length FLOAT,
        petal_width FLOAT,
        species VARCHAR(50),
        PRIMARY KEY (id)
)

2024-04-05 16:34:08,307 INFO sqlalchemy.engine.Engine [no key 0.00054s] {}
2024-04-05 16:34:08,316 INFO sqlalchemy.engine.Engine COMMIT
2024-04-05 16:34:08,318 INFO sqlalchemy.engine.Engine BEGIN (implicit)
2024-04-05 16:34:08,319 INFO sqlalchemy.engine.Engine DESCRIBE `muldb`.`iris3`
2024-04-05 16:34:08,319 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,320 INFO sqlalchemy.engine.Engine DESCRIBE `muldb`.`iris3`
2024-04-05 16:34:08,320 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,321 INFO sqlalchemy.engine.Engine SHOW FULL TABLES FROM `muldb`
2024-04-05 16:34:08,321 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,322 INFO sqlalchemy.engine.Engine SHOW FULL TABLES FROM `muldb`
2024-04-05 16:34:08,322 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,327 INFO sqlalchemy.engine.Engine SHOW CREATE TABLE `iris3`
2024-04-05 16:34:08,327 INFO sqlalchemy.engine.Engine [raw sql] {}
2024-04-05 16:34:08,329 INFO sqlalchemy.engine.Engine
DROP TABLE iris3
2024-04-05 16:34:08,329 INFO sqlalchemy.engine.Engine [no key 0.00023s] {}
2024-04-05 16:34:08,335 INFO sqlalchemy.engine.Engine 
CREATE TABLE iris3 (
        sepal_length FLOAT(53),
        sepal_width FLOAT(53),
        petal_length FLOAT(53),
        petal_width FLOAT(53),
        species TEXT
)

2024-04-05 16:34:08,335 INFO sqlalchemy.engine.Engine [no key 0.00036s] {}
2024-04-05 16:34:08,343 INFO sqlalchemy.engine.Engine INSERT INTO iris3 (sepal_length, sepal_width, petal_length, petal_width, species) VALUES (%(sepal_length)s, %(sepal_width)s, %(petal_length)s, %(petal_width)s, %(species)s)
2024-04-05 16:34:08,343 INFO sqlalchemy.engine.Engine [generated in 0.00044s] [{'sepal_length': 5.1, 'sepal_width': 3.5, 'petal_length': 1.4, 'petal_width': 0.2, 'species': 'setosa'}, {'sepal_length': 4.9, 'sepal_width': 3.0, 'petal_length': 1.4, 'petal_width': 0.2, 'species': 'setosa'}, {'sepal_length': 4.7, 'sepal_width': 3.2, 'petal_length': 1.3, 'petal_width': 0.2, 'species': 'setosa'}, {'sepal_length': 4.6, 'sepal_width': 3.1, 'petal_length': 1.5, 'petal_width': 0.2, 'species': 'setosa'}, {'sepal_length': 5.0, 'sepal_width': 3.6, 'petal_length': 1.4, 'petal_width': 0.2, 'species': 'setosa'}, {'sepal_length': 5.4, 'sepal_width': 3.9, 'petal_length': 1.7, 'petal_width': 0.4, 'species': 'setosa'}, {'sepal_length': 4.6, 'sepal_width': 3.4, 'petal_length': 1.4, 'petal_width': 0.3, 'species': 'setosa'}, {'sepal_length': 5.0, 'sepal_width': 3.4, 'petal_length': 1.5, 'petal_width': 0.2, 'species': 'setosa'}  ... displaying 10 of 150 total bound parameter sets ...  {'sepal_length': 6.2, 'sepal_width': 3.4, 'petal_length': 5.4, 'petal_width': 2.3, 'species': 'virginica'}, {'sepal_length': 5.9, 'sepal_width': 3.0, 'petal_length': 5.1, 'petal_width': 1.8, 'species': 'virginica'}]
2024-04-05 16:34:08,346 INFO sqlalchemy.engine.Engine COMMIT
완료
  • 이번에는 MySQL Workbench에서 확인한다.

Untitled

결론

  • 약간의 차이가 존재한다. 따라서 각 상황에 맞는 것을 골라서 잘 활용하면 될 것 같다.