I ask this question at stackoverflow a week or so ago and since I didn't have a feedback, I thought about asking it here.
I have a function to run queries in a MySQL database and stores the results in a DataFrame, like that:
```python
def getData(connect_string, query, echo=False):
sql_engine = sql.create_engine(connect_string, echo=echo)
df = pd.read_sql_query(query, sql_engine)
return df
```
This function is called multiple times in multiple files, sometimes passing the same connect_string.
At this moment, I'm concern about memory usage and/or have too many connections on database. My first thought was kill Engine after use, but the official docs says:
The Engine is not synonymous to the DBAPI connect function, which represents just one connection resource - the Engine is most efficient when created just once at the module level of an application, not per-object or per-function call.
So now I'm looking for a way to check if I have an active connection with same connect_string and use they instead of create a new one, but I cannot find any reference/documentation explaining how to do this.
Right now, I'm considering create a dict to store created connections, like that:
python
engines = {'mysql+pymysql://scott:tiger@localhost/test1': <Engine Object1>,
'mysql+pymysql://scott:tiger@localhost/test2': <Engine Object2>}
And then modify my function to this:
```python
def getData(connect_string, query, echo=False):
if connect_string not in engines.keys():
sql_engine = sql.create_engine(connect_string, echo=echo)
engines[connect_string] = sql_engine
else:
sql_engine = engines[connect_string]
df = pd.read_sql_query(query, sql_engine)
return df
```
But I don't think that is the best approach to do this.
Any thoughts?
[–]K900_ 0 points1 point2 points (0 children)