Python Saved My Work

This is a short technical tale of how Python saved my work. It includes some lessons learned and some mistakes that one should avoid in the future.

Python Saved My Work
Photo by Hitesh Choudhary

This short piece is a real-life situation and experience of how I messed up the COVID-19 Database and fixed it all with Python.

As you might have heard or seen, when the pandemic (Coronavirus) began in Zimbabwe in early March of 2020, we launched the COVID-19 Hub for Zimbabwe initiative under African Surveyors Connect.

This project has been of a great deal to me as it has opened my mind and helped me build my skillsets when it comes to Geographic Information Systems and applying these skills in times of need for the health sector.

Cutting the long story short, just a few months ago while updating the database through the ArcGIS Online portal using a Geo-Form, I accessed the whole database with the intent to sort things out and make a few changes. Not knowingly I then accidentally set the date column for the COVID-19 time series data to the current date-time using the CURRENT_DATE() function within the platform. This resulted in a messed up dataset which had all the cumulative records set to the current date and time which mean't all the data was nolonger meaningful at all.

Luckily enough, I always make sure I back up this data to a GitHub repository for public access which is for anyone who might need this data for Non-Commercial usage.

As I said: this is a short story about how Python can fix such cases when dealing with large amounts of datasets within the ArcGIS Online platform.

As you might know, the Graphical-User Interface (GUI) within the ArcGIS Online platform does not allow you to delete all records in a Feature(Hosted)Layer. This presents a huge problem especially in my case when I needed to clear out all the data within the Feature(Hosted)Layer.

Overwriting the Feature Layer? Unfortunately, for the Overwrite Feature Layer option to show up, you need to have the source file for which the Feature Layer was published and I deleted this file.

What a mess right?

Photo by Yogendra Singh

Getting to the main story now.

The Python solution

Using the ArcGIS API for Python and Jupyter Notebooks, you can fix all the problems outlined. If you do not have these packages, you can get the Python API from the ESRI Developers platform including instructions on how you can use it. For Jupyter Notebooks, you can download Anaconda which comes with all the much-needed packages to get you started.

Once all that is set up has been done, let's get right into the coding section where we are going to be writing some commands which will help solve this mess.

Within your Jupyter Notebooks, let's import the relevant libraries which we want to use for this task.

from arcgis.gis import GIS
from arcgis import features

Let's login to our ArcGIS Portal (Organizational account).

gis = GIS("https://<organization_name>.maps.arcgis.com/", '<username>', '<password>')

Be sure to replace

After a successful login, let's run a search query and store the Feature Layer (the one we need to replace) in a variable.

covid_layer = gis.content.search("title: <name_of_the_feature_layer>, item_type="Feature Layer"
display(covid_layer) 

Information about the search results will be displayed. Depending on the number of search results retrieved I just want to make sure I am going to overwrite the right layer here. So I am going to created another variable to store this layer and then using the indexes to get the correct layer from the search result.

Lets call that search result and use the index just to be sure. My layer was the first result that showed up, so using the Indexes that will get a value of Zero (0).

Always remember and take note that indexes start at zero (0)

to_overwrite = covid_layer[0]
display(to_overwrite) 

Let's import another python module from the ArcGIS API for Python which handles Feature Layers.

from arcgis.features import FeatureLayerCollection

Now this is the part where we are going to be writing some long pieces of code. I will explain this at the end of the snippet.

feature_layers = to_overwrite.layers

max_objid = feature_layer.query(out_statistics=[{"staticsticType":"MAX","onStatisticField":"OBJECTID","outStatisticFieldName":"MAX_OBJ"}], return_geometry=False)
maxoid = max_objid.features[0].attributes['MAX_OBJ']

i = 0
step = 2000

while i <= maxoid:
  i += step
  feature_layers.delete_features(where=f"OBJECTID <= {i}")
  print(i)

print("Feature Layer was reset and erased sucessfully")

With the few lines of code above we did several things.

If you see the printed message at the end of the notebook it means our code was a success.

Return to the Feature Layer and check for any records. No records will be found within.

Using the backup you already had, you can then select the Append Data to Layer option and append the data to the layer without resetting any layer IDs and configured parameters that were in place.

No alt text provided for this image

There are probably many ways to do this, but I found this an easy way to continue with the work you have been doing. I also learned quite a number of lessons from this event. Some of them I have listed below which I believe most of the people reading this will get one or two.

Lessons:

  • always backup your data
  • in ArcGIS Online, NEVER delete the layer definition file when creating a feature layer.
  • know your Python scripts for troubleshooting
  • spend less time in the actual database. Normalize automating most of the tasks and processes.
  • Never open the database view when you are not sure about what you want to do or the operations you want to perform.

I hope this tutorial was very helpful in case anybody might run into the same trouble that I went through.