In this new post to his site Freek Van der Herten shares a time when he was working on a data import that ended in some unexpected results thanks to an interesting race condition.
Recently I was working on a client project where a data import was performed via queues. Each record was imported in its own queued job using multiple queue workers. After the data import was done we had more rows than expected in the database. In this blogpost I'd like to explain why that happened.
He starts by digging into the code that made use of the
firstOrCreate method in Laravel's Eloquent handling to find if an entry had already been created for the given data. The method uses two queries, one to determine if the record exists and another to create it if not. The issue was with the fact that it was being handled in a queue meaning that the select could happen and return false while another process was creating the record. He even created a demo app to show it happening and includes screenshots showing the result. He recommends moving the process to a separate queue and having only one worker executing at a time. There's not a good code-based solution for it as it's more of an issue with the architecture than the application itself.