Hi! I have never written articles, this will be my first time.

At my job, we use PostgreSQL for data storage and often need to write many small, simple queries for the analytics department. I wanted to try GPT-3 and see how it could help with this task. However, using the GPT-3 Playground is inconvenient for this task because you need to manually describe the structure of all the tables first.
So, I decided to write a small script that would convert the database structure into a text query that could be used with instructions to create real SQL queries. I chose to use Python for this project because I wanted to try a new programming language.
I figured this would be a good way to try out a new programming language and try to write something useful.
Here was the plan for solving the problem:
Writing in new languages thanks to Copilot (it based on GPT-3 Codex model) is a great experience, and I highly recommend trying it with any programming language you are new to.
For those who are unfamiliar with GPT-3: it is the most advanced neural network based on the GPT (Generative Pre-trained Transformer) principle. It can generate text, perform natural language processing tasks, solve image processing tasks, and more.
Andréj Karpathy talks about how GPT-3 works in this video: https://www.youtube.com/watch?v=kCc8FmEb1nY.
Also, it is worth mentioning that this is the same network that powers ChatGPT (https://chat.openai.com).
To connect to the database and retrieve the table structure, I used the psycopg2 library, which was recommended to me by GPT-3. Before we get started, let’s install all the necessary dependencies for the project:
pip install psycopg2 openai
Now, start with Schema class:
class Schema:
"""Generate SQL Schema from PostgreSQL"""
def __init__(self, schema = 'public'):
"""Connect to PostgreSQL database"""
self.schema = schema
try:
self.conn = psycopg2.connect(DATABASE_URL)
except psycopg2.OperationalError as err:
print(f'Unable to connect!
{err}')
sys.exit(1)
else:
print('Connected to PostgreSQL database successfully.')
self.cur = self.conn.cursor()
self.comments = []
self.tables = []
self.columns = []Next, we retrieve the list of tables and their comments. Since comments are stored in a separate location in PostgreSQL, we need to execute an additional query to get all comments for all tables.
def get_tables(self):
"""Get list of tables"""
self.cur.execute(
"SELECT table_name FROM information_schema.tables WHERE table_schema = %s",
(self.schema,)
)
tables = self.cur.fetchall()
self.tables = tables
return tables
def get_all_comments(self):
"""Get list of all comments"""
self.cur.execute('''
SELECT
c.table_schema,
c.table_name,
c.column_name,
pgd.description
FROM
pg_catalog.pg_statio_all_tables AS st
INNER JOIN
pg_catalog.pg_description pgd ON (pgd.objoid = st.relid)
INNER JOIN
information_schema.columns c ON (
pgd.objsubid = c.ordinal_position
AND c.table_schema = st.schemaname
AND c.table_name = st.relname
)
''')
comments = self.cur.fetchall()
self.comments = comments
return commentsNext, we obtain a list of fields and their data types for each table.
def get_columns(self, table):
"""Get list of columns for a table"""
self.cur.execute(
"""
SELECT column_name, data_type
FROM information_schema.columns
WHERE table_schema = %s AND table_name = %s
""",
(self.schema, table)
)
columns = self.cur.fetchall()
return columnsWe will generate the full structure of the base. In my repository on GitHub you can find the generation for the tables you need, but in this example I will simply generate the structure for all tables.
The less extra tables are used in generation, the easier it is to get the correct result and the cheaper the request to the OpenAI API will cost.
We compile all the information into a single string to create the text structure of the database:
def index(self):
"""Generate SQL Schema"""
prompt = ''
tables = self.get_tables()
comments = self.get_all_comments()
for table in tables:
columns = self.get_columns(table[0])
prompt += f'The "{table[0]}" table has columns: '
for column in columns:
cmnt = ''
for comment in comments:
if (comment[0] == self.schema and
comment[1] == table[0] and
comment[2] == column[0]):
cmnt = comment[3]
break
if cmnt == '':
prompt += f'{column[0]} ({column[1]}), '
else:
prompt += f'{column[0]} ({column[1]} - {cmnt}), '
prompt = prompt[:-2] + '. '
return prompt
We now have the text structure of the database. At this point, I created a simple UI using Vue and Bootstrap for ease of use with the API. You can see what it looks like in the first GIF at the beginning of the article.
The code for the UI is available in the GitHub repository.

For demonstration purposes, I will be using a simple text input, but you can use any other convenient method. We save the resulting class code in the schema.py file and then import it in the main.py file.
Remember to obtain an OpenAI API key to access the API. You can get one at: https://platform.openai.com/account/api-keys.
import openai from schema import schema # Our class to generate schemas openai.api_key = 'sk-......' # Your api key prompt = input('Enter prompt: ')Now we need to properly address GPT-3 and tell what we expect from it. Here you need to connect your creativity in making a request. I was inspired by some requests I found during the implementation on the seekwell.io blog.
My request ends up looking like this:
Given an input question, respond with syntactically correct PostgreSQL. Be creative but the SQL must be correct, not nessesary to use all tables. {sql_schema}\n\nInstructions: {prompt}\n\nSQL:
Where is:
For example, we get the following request:
Given an input question, respond with syntactically correct PostgreSQL. Be creative but the SQL must be correct, not necessary to use all tables. The "public"."users" table has columns: id (integer — user id) name (text — user name) email (text — user email) The "public"."posts" table has columns: id (integer — post id) title (text — post title) body (text — post body) user_id (integer — user id) The "public"."comments" table has columns: id (integer — comment id) body (text — comment body) post_id (integer — post id) user_id (integer — user id) Instructions: Give me all posts by user where email hosted on gmail.com SQL:
As a result from GPT-3, we can get the following answer (the example was generated by copilot at the time of writing the article, this is mind-blowing!):
SELECT * FROM posts WHERE user_id IN (SELECT id FROM users WHERE email LIKE ‘%@gmail.com’);
OK. But how to do it in code?
First, we need to set the correct set of parameters for the request in the GPT-3 API. We have these important parameters to pass into query:
Let’s now make a request to the GPT-3 API and get a response:
query_temperature = 0.5
final_prompt = f'''
Given an input question, respond with syntactically correct PostgreSQL.
Be creative but the SQL must be correct, not necessary to use all tables.
{sql_schema}
Instructions: {prompt}
SQL:
'''
gpt_response = openai.completion.create(
engine="text-davinci-003",
prompt=final_prompt,
temperature=float(query_temperature),
max_tokens=150,
stop=["
"]
)
print(f'GPT-3 response: {gpt_response["choices"][0]["text"]}')You can find the full implementation here: https://github.com/Hormold/gpt-sql-box/blob/master/cli.py
That’s all! Now we can get a response from the GPT-3 API and send it to PostgreSQL for execution.
Be careful because the GPT-3 API may return invalid SQL query, which can lead to data loss — carefully check the model responses before executing the query.
We can easily change the database to a SQL like database such as MySQL or MariaDB by simply changing the way the database schema is built and making small changes to the final_prompt.
For the convenient use of this code, I made a small project on GitHub with minimal functionality and interface for easy use: https://github.com/Hormold/gpt-sql-box

Here is Midjourney illustration for this small project
If you want to use the GPT-3 API in your projects, then you will need to register at https://platform.openai.com and get an API key.
I hope you enjoyed the article and will be able to use the GPT-3 API in your next projects.
If you have questions, then write to me on Telegram @define and I will try to help you.