Recently, I have been involved in a survey project for Tokyo University. I’m in charge for making the back-end API for the application. To complete this project, I have been using Django, with the Django REST framework. The backend uses a PostgreSQL database with several tables.
Downloadable Tables
The client requested that I add a feature to allow administration users to download the data in CSV format. They also wanted me to include a feature where all of the data can be dumped into a zip file for download.
I will write about how I accomplished this using two different methods.
Downloading CSV data from individual tables
Since we were on a bit of a time crunch, I wanted to quickly fulfill the requirements. I needed a way to allow admin users to download all the data in each individual table.
A brief search led me to Django Import-Export . This library gave me a quick and easy method to allow users to download table data in a variety of formats, such as CSV, JSON, and YAML, to name a few. It’s quite easy to use after adding it to the INSTALLED_APPS section in Django’s settings.py file.
INSTALLED_APPS = [
. . .
‘import_export’,
]
After that, I had to modify my app’s admin.py file by adding a “Resource” class for each model. This would include import-export’s ModelResource class.It’s also required to add the ExportMixin to the ModelAdmin class.
from import_export import resources
from import_export.admin import ExportMixin
class UserInfoResource(resources.ModelResource):
class Meta:
model = UserInfo
class UserInfoAdmin(ExportMixin, admin.ModelAdmin):
. . .
With these few lines of code, the Model’s database table can now be downloaded in the admin screen. Keep in mind that the above “Resource” needs to be done for each model that you want to be made available for download.
With that finished, you will now have an Export button available in the changeview page, which will take you to the “Export” page where the user can select from a number of different options to export your database table and download it.
But what if you want to limit the choices?
Fortunately, this is easy to do. In fact, I implemented this so that I could limit the choice to just one datatype (CSV).
To accomplish this, I made a custom Mixin that calls django-import-export’s ExportMixin as its source, and modified the get_export_formats function.
**def get_export_formats(self):
formats = (
base_formats.CSV,
)
return [f for f in formats if f().can_export()]**
Replace each Model Admin’s ‘ExportMixin’ with your own custom mixin, and you’re good to go.
Grabbing all CSVs and packaging them into a ZIP file
Now comes to complicated part. I needed to make all the database tables downloaded into a zip file. The original request was just to do something similar to a pgdump; essentially dump all the data into one file and be done with.
But would you want to look at a single file full of data? That would be hard to read! I know I wouldn’t want to deal with that, so I didn’t want to expose the end-user to that either.
I needed to accomplish the following:
- Add a ‘Download’ button to the main admin page.
- On click, do the following: Add all of the tables into memory.
- Convert them into CSV format.
- Add them to a ZIP file.
- Initiate the download.
Download Button
The first step was to modify the admin template, specifically index.html.
That was just the start. How do I get the button to do something? At first, I thought about using javascript to accomplish that. However, I wanted to try and avoid using other languages if possible.
After some google-foo and research, I found that the best way was to use window.open(function) to call the function ‘download_zip’.
{% extends "admin/index.html" %}
{% load i18n admin_urls static %}
{% block content %}
<div>
<input type="button" value="Download" onclick="window.open('download_zip')"
</div>
{{block.super}}
{% endblock %}
ZIP!
Time to get down and dirty.
This took quite a bit of time. After some searching, I came across a few examples of how to make a zip file using Python. However, I also needed to combine this with django-import-export’s functions.
First, to make a csv file, you need to take the Resource, export it, and convert it into csv. Fortunately, import-export makes this easy to do.
resource = UserInfoResource()
resource = resource.export()
resource = resource.csv
Since I had several tables of data, I made a list containing all of the Resource classes.
file_list = [
UserInfoResource(),
. . . ,
]
I made a separate function that focused on taking that list of Resources and converted them into separate CSV files. These files were placed into a python dictionary. Each file had a key that was the name of the model, with a date appended to the end.
def getfiles():
fileSet = {}
date_now = dt.datetime.now().strftime('%Y%m%d%H%M')
for file in file_list:
dataset = file.export()
dataset = dataset.csv
name = file._meta.model.__name__ + date_now
fileSet[name] = dataset
return fileSet
Since I had several tables of data, I made a list containing all of the Resource classes.
file_list = [
UserInfoResource(),
. . . ,
]
Download_Zip
Zipfile is part of the Python library. You can read the documentation here .
This was probably the most difficult part. I followed a couple of examples online here and there about making a zip file, but I needed to get it to work with my CSV files. To accomplish this, I paired up with my co-worker who lives in Japan. We spent about two hours pair-programming to make a successful function!
Upon clicking the Download button on the admin page, this download_zip() function is called. First, it grabs the dictionary of CSV files from the getfiles() function.
def download_zip(request):
files = getfiles()
. . .
Next, we had to make an io file object, from Python’s io module.
Because we wanted to prevent the function from saving any files onto the server, we needed to create the file object in-memory. Now, I’m still new to this, but I believe this is why making an io.BytesIO file object is useful.
def download_zip(request):
files = getfiles()
zip_buffer = io.BytesIO()
. . .
Now for the ZipFile() object, which is this:
_class_ zipfile.ZipFile(_file_, _mode_, _compression_, _allowZip64_, _compresslevel_)
file is a path to the file; in our case, we want to write into our bytes-like file object, zip_buffer.
mode can be r, w, a, or x. We want it to be a, (append) since we want to add multiple files to a single zip file.
compression is the type of compression used. In our case we used Zip_Deflated.
allowZip64 is True by default, but we made it False. If True, and if the zip file is over 4GB in size, it would add a Zip64 extension.
Finally, we have compresslevel. This sets the level of compression, and accepts an integer from 0 – 9, though we didn’t assign a value here.
And so the zipfile.ZipFile() function looks like this, with a simple for loop inside of it to loop through our files dictionary and append each file into the Zip file.
with zipfile.ZipFile(zip_buffer, "a", zipfile.ZIP_DEFLATED, False) as zip_file:
for k, file in files.items():
zip_file.writestr(k + '.csv', file)
Before we can return it in an HttpResponse for download, we have one more command to implement.
_zip_buffer.seek(0)_
The seek method is used here to return the “cursor” position back to the beginning of the file, which is the zip_buffer in-memory object in our case. You can read more about seek here .
Interestingly, during testing, when we removed this command and attempted to return it in the HttpResponse, it returned an empty zip file.
I’m not 100% positive on why this is, but I believe that because the file is in-memory, and not saved to disc, without first resetting the position to zero first, the HttpResponse only sees empty space and returns that.
Finally, we just have to return the response!
resp = HttpResponse(zip_buffer, content_type='application/zip')
resp['Content-Disposition'] = 'attachment; filename = %s' % zip_filename
return resp
The Complete Code
file_list = [
. . .,
UserInfoResource()
]
def getfiles():
fileSet = {}
date_now = dt.datetime.now().strftime('%Y%m%d%H%M')
for file in file_list:
dataset = file.export()
dataset = dataset.csv
name = file._meta.model.__name__ + date_now
fileSet[name] = dataset
return fileSet
def download_zip(request):
files = getfiles()
zip_filename = 'Survey_Data' + dt.datetime.now().strftime('%Y%m%d%H%M') + '.zip'
zip_buffer = io.BytesIO()
with zipfile.ZipFile(zip_buffer, "a", zipfile.ZIP_DEFLATED, False) as zip_file:
for k, file in files.items():
zip_file.writestr(k + '.csv', file)
zip_buffer.seek(0)
resp = HttpResponse(zip_buffer, content_type='application/zip')
resp['Content-Disposition'] = 'attachment; filename = %s' % zip_filename
return resp
That’s it!
We now have an easily maintainable function that allows users to download all their database tables at once, packed into a zip file from Django’s built-in admin page. Furthermore, if the user only wants to download an individual table, they just need to visit the page for that database.
This code is open source, so feel free to use it.
Until next time!