TP INSEE

Configuration de l’Environnement

Docker MySQL :

I used a MySQL docker for my database. Its simple and quick to install.

VSCode :

I chose VSCode as my IDE for its versatility and compatibility with database extensions. I installed the MySQL database management extension directly from VSCode.

Création de l’Utilisateur et de la table :

In MySQL, I created a specific user named “insee_app.” Afterward, I created the “communes” table and added appropriate privileges to access the table, which I would use to store the INSEE data.

Ansible :

I used Ansible to automate the execution of the Python script on the remote server. This allowed me to pass the database credentials securely and execute the script without manual intervention.

docker compose

version: '3'
services:
  mysql:
    image: mysql:latest
    container_name: welpdb
    environment:
      MYSQL_ROOT_PASSWORD: <root_password>
      MYSQL_DATABASE: <db_name>
      MYSQL_USER: <user_name>
      MYSQL_PASSWORD: <password>
    ports:
      - "3306:3306"
    volumes:
      - /mnt/storage/mysql_data_welpdb:/var/lib/mysql

Table Creation

CREATE TABLE communes2 (
    Code_commune_INSEE INT(10) NOT NULL,
    Nom_de_la_commune VARCHAR(255) NOT NULL,
    Code_postal INT(10) NOT NULL,
    PRIMARY KEY (Code_commune_INSEE)
);

Creation User

CREATE USER 'insee_app'@'192.168.1.34' IDENTIFIED BY 'insee';

User Acces

GRANT USAGE,SELECT,INSERT,DELETE,UPDATE ON communes TO 'insee_app'@'192.168.1.34';

Requirements

mysql-connector-python==8.0.23

sudo apt install mysql-client-core-8.0

Ansible Playbook

#tp_insee/main.yml
---
- name: Exécuter un script sur un serveur distant
  hosts: all
  become: yes

  tasks:
    - name: Exécuter le script Python avec les variables
      command: python3 /scripts/insert_data_insee.py --host '{{ mysql_host }}' --user '{{ mysql_user }}' --port '{{ mysql_port|int }}' --password '{{ mysql_password }}' --database '{{ mysql_database }}'

Ansible Inventory

node1.welpnetwork.com mysql_host=192.168.1.34 mysql_user=insee_app mysql_port=3333 mysql_password=insee mysql_database=welpdb

Structure du projet Ansible

tp_insee/
├── roles/
		├── id_welpdb/
│   └── main.yml
└── main.yml
#!/usr/bin/env python3
import sys
import csv
import mysql.connector

host = sys.argv[sys.argv.index("--host") + 1]
user = sys.argv[sys.argv.index("--user") + 1]
port = int(sys.argv[sys.argv.index("--port") + 1])
password = sys.argv[sys.argv.index("--password") + 1]
database = sys.argv[sys.argv.index("--database") + 1]

# Database connection
conn = mysql.connector.connect(
    host=host,
    user=user,
    port=port,
    password=password,
    database=database
)
cursor = conn.cursor()

# CSV file
fichier_csv = 'Hexa Small.csv'

# Open the CSV file
with open(fichier_csv, 'r', newline='', encoding='iso-8859-1') as csvfile:
    csv_reader = csv.reader(csvfile, delimiter=';')  # CSV delimiter is ';'
    next(csv_reader)  # Skip the header row

    # Loop over the rows in the CSV file
    for row in csv_reader:    
        code_commune_INSEE = row[0]
        nom_de_la_commune = row[1]
        code_postal = row[2]

        # Insert data into the 'communes2' table
        cursor.execute('''
            INSERT INTO communes2 (Code_commune_INSEE, Nom_de_la_commune, Code_postal)
            VALUES (%s, %s, %s)
            ON DUPLICATE KEY UPDATE
            Nom_de_la_commune = VALUES(Nom_de_la_commune),
            Code_postal = VALUES(Code_postal)
        ''', (code_commune_INSEE, nom_de_la_commune, code_postal))

conn.commit()
conn.close()

Result

View

CREATE VIEW view_communes_du_nord
AS
SELECT * FROM communes2
WHERE LEFT(code_postal, 2) IN ('59', '62');

add colonne population

ALTER TABLE communes2
ADD COLUMN Population INT(255);

add population to table with API

cat insert_population.py
#!/usr/bin/env python3

import requests
import mysql.connector
import sys

host = sys.argv[sys.argv.index("--host") + 1]
user = sys.argv[sys.argv.index("--user") + 1]
port = int(sys.argv[sys.argv.index("--port") + 1])
password = sys.argv[sys.argv.index("--password") + 1]
database = sys.argv[sys.argv.index("--database") + 1]

# Database connection
conn = mysql.connector.connect(
    host=host,
    user=user,
    port=port,
    password=password,
    database=database
)
cursor = conn.cursor()

# API URL
api_url = "https://geo.api.gouv.fr/communes/{}"

# Get data from the database
cursor.execute("SELECT Code_commune_INSEE FROM communes2 LIMIT 100")
communes = cursor.fetchall()

# Loop over the communes
for commune in communes:
    code_commune_insee = commune[0]

    # Call the API to get population data
    response = requests.get(api_url.format(code_commune_insee))

    # Process the JSON response
    if response.status_code == 200:
        commune_data = response.json()

        # Get the population data
        population = commune_data.get("population", None)

        # Update the database
        if population is not None:
            # Update the 'Population' column in the 'communes2' table
            update_query = "UPDATE communes2 SET Population = %s WHERE Code_commune_INSEE = %s"
            cursor.execute(update_query, (population, code_commune_insee))
            conn.commit()

# Close the connection
cursor.close()
conn.close()

TP INSEE 2

Technology used

Docker Ansible

https://hub.docker.com/r/semaphoreui/semaphore

Ansible Semaphore is a modern UI for Ansible. It lets you easily run Ansible playbooks, get notifications about fails, control access to deployment system.

Docker MySQL

I used a MySQL docker for my database. Its simple and quick to install.

You need to install mysql connecteur on the distant server.

sudo apt install python3-mysql.connector

Docker Vault

Vault, developed by HashiCorp, is a secret management tool that allows storing, managing, and controlling access to tokens, passwords, certificates, API keys, and other secrets.

version: '3'

services:
  vault: 
    image: hashicorp/vault
    container_name: vault
    ports: 
      - "8200:8200"
    volumes:
      - ./vault:/vault/file
      - ./config:/vault/config
    restart: always

GitHub

I used GitHub to host my Ansible playbooks and enable a Docker container to execute these playbooks by fetching them from my GitHub repository

VIEW

We began by creating an SQL view named “view_communes_du_nord” by filtering data from the “communes2” table to display only the communes with postal codes starting with ‘59’ or ‘62,’ representing the Nord Pas de Calais region.

CREATE VIEW view_communes_du_nord
AS
SELECT * FROM communes2
WHERE LENGTH(code_postal) = 5 AND LEFT(code_postal, 2) IN ('59', '62');

ADD COLUMN POPULATION

Next, we extended the structure of the “communes2” table by adding a new column named “Population,” with an INTEGER data type. This column would be used to store the population data we would retrieve.

ALTER TABLE communes2
ADD COLUMN Population INT(255);

ADD POPULATION WITH API

We developed a Python script, “insert_population.py,” which was executed on a remote server using Ansible. This script iterated through the first 100 INSEE codes in the “communes2” table, made API calls with each code, and fetched population information for each commune.

import requests
import mysql.connector
import sys

host = sys.argv[sys.argv.index("--host") + 1]
user = sys.argv[sys.argv.index("--user") + 1]
port = int(sys.argv[sys.argv.index("--port") + 1])
password = sys.argv[sys.argv.index("--password") + 1]
database = sys.argv[sys.argv.index("--database") + 1]

# Database connection
conn = mysql.connector.connect(
    host=host,
    user=user,
    port=port,
    password=password,
    database=database
)
cursor = conn.cursor()

# API URL
api_url = "https://geo.api.gouv.fr/communes/{}"

# Get data from the database
cursor.execute("SELECT Code_commune_INSEE FROM communes2 LIMIT 100")
communes = cursor.fetchall()

# Loop over the communes
for commune in communes:
    code_commune_insee = commune[0]

    # Call the API to get population data
    params = {
        "code_commune_insee": code_commune_insee
    }
    response = requests.get(api_url.format(code_commune_insee))

    # Process the JSON response
    if response.status_code == 200:
        population_data = response.json()
        population = population_data.get("population", None)

        # Update the database
        if population is not None:
            update_query = "UPDATE communes2 SET Population = %s WHERE Code_commune_INSEE = %s"
            cursor.execute(update_query, (population, code_commune_insee))
            conn.commit()

# Close the connection
cursor.close()
conn.close()

Ansible vault

  • Create a vault file
ansible-vault create vault.yml
  • Put a password
  • list variable you need to crypt
ANSIBLE_TOKEN: hvs.yyKFrGYsiknu9Qs7XiJTPI2u

without the password the file look like this

$ANSIBLE_VAULT;1.1;AES256
63336439666535393162393064613833343266366439646430326361653038353464623466653236
3735363733326161643637393839623332613864343164350a336234613536646262376561386533
38313039303866396336633866616430316236316239633161313663636538326264386361613361
3638313635336436370a333463373036326236313531613832353762656137626134383334633434
37613465666663616439343836623532643835393231346261386232363830373137346563363030
3466353664373034333838333264633265333239313663623331

Vault Secret

{
  "mysql_database": "welpdb",
  "mysql_host": "192.168.1.27",
  "mysql_password": "insee",
  "mysql_port": "3333",
  "mysql_user": "insee_app"
}

Ansible playbook

The “insert_population.yml” Ansible playbook is a YAML file that defines a set of tasks and instructions for Ansible to execute on the target server(s) specified in the inventory. In this playbook, we have the following tasks.

---
- name: Exécuter un script sur un serveur distant
  hosts: node2
  become: yes
	vars_files:
    - "./vault.yml"

  tasks:
    - name: Copier le script
      copy:
        src: ./insert_population.py  
        dest: /scripts/insert_population.py  
      become: yes

    - name: Return secrets
      set_fact:
        vault_secrets: "{{ lookup('community.general.hashi_vault', 'secret=secret/data/database token={{ANSIBLE_TOKEN}} url=http://192.168.1.23:8200')}}"

    - name: Exécuter le script Python avec les variables de Vault
      command: >
        python3 /scripts/insert_population.py
        --host "{{ vault_secrets.data.mysql_host | default(vault_secrets.mysql_host) }}"
        --user "{{ vault_secrets.data.mysql_user | default(vault_secrets.mysql_user) }}"
        --port "{{ vault_secrets.data.mysql_port | default(vault_secrets.mysql_port) }}"
        --password "{{ vault_secrets.data.mysql_password | default(vault_secrets.mysql_password) }}"
        --database "{{ vault_secrets.data.mysql_database | default(vault_secrets.mysql_database) }}"        
   
    - name: Supprimer le script
      file: 
        path: /scripts/insert_population.py
        state: absent
      become: yes