This is the nature of maintaining software. You step into a role and are handed a repository of code. There may be a lot of it, and its quality is questionable at best. There may even be a ton of cruft. I recently encountered one such repo involving a 7-year-old WordPress Multisite. Additionally, a number of custom database tables and PHP CRUD apps had been built alongside, and intertwined with, this MU instance. About a year before I joined the project, the old Multisite instance became the basis for a new website, and a number of themes, plugins, library code, and log files were unnecessarily added to the new repository. Even if one performs git rm, those files still remain in the history and are downloaded with every new clone. Since it was still relatively early in the project’s history (and I was the only active developer), I decided to try some more advanced git magic to purge these files.
Enter git filter-branch
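As a rough illustration of the kind of purge described above, here is a minimal sketch that removes one path from every commit. It is not necessarily the exact command used on this project. The function name purge_path and the example path are hypothetical placeholders, and the sketch assumes the path contains no single quotes.

```shell
#!/bin/sh
# Hypothetical helper (not from the original post): remove a path from every
# commit on every branch and tag. Run it from the root of a clean working tree.
purge_path() {
    target="$1"   # path to purge; assumed to contain no single quotes

    # --index-filter rewrites each commit's index without checking out files.
    # --ignore-unmatch keeps git rm from failing on commits that lack the path.
    # --prune-empty drops commits that become empty once the path is gone.
    # --tag-name-filter cat re-points existing tags at the rewritten commits.
    git filter-branch --force \
        --index-filter "git rm -r --cached --ignore-unmatch --quiet -- '$target'" \
        --prune-empty --tag-name-filter cat -- --all
}

# Example call with a placeholder path:
#   purge_path wp-content/old-logs
```

Every rewritten commit gets a new hash, so the result has to be force-pushed and any existing clones replaced. filter-branch also keeps backup refs under refs/original, so the old objects stay in the local repository until those refs and the reflog are cleaned up and garbage collection runs.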
An Alternate Approach
If you have more sophisticated needs, there is a tool that builds upon git-filter-branch
Another use case
I wanted to move a small repository of helper scripts and template config files to my primary GitHub account. The problem was that the user name and email on all my commits belonged to the source git account. I found this GitHub help article outlining how you can change the user name and email of a committer and author within your history.
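The approach in that kind of article boils down to an --env-filter that swaps the identity on matching commits. Below is a minimal sketch along those lines. The wrapper function rewrite_identity and the variable names are illustrative assumptions, and the email addresses in the example call are placeholders.

```shell
#!/bin/sh
# Hypothetical wrapper (names are illustrative): replace the author and
# committer identity on every commit whose email matches the old address.
rewrite_identity() {
    # Exported so the filter, which git filter-branch evaluates in its own
    # shell process, can read them from the environment.
    export OLD_EMAIL="$1" CORRECT_NAME="$2" CORRECT_EMAIL="$3"

    git filter-branch --force --env-filter '
        if [ "$GIT_COMMITTER_EMAIL" = "$OLD_EMAIL" ]; then
            export GIT_COMMITTER_NAME="$CORRECT_NAME"
            export GIT_COMMITTER_EMAIL="$CORRECT_EMAIL"
        fi
        if [ "$GIT_AUTHOR_EMAIL" = "$OLD_EMAIL" ]; then
            export GIT_AUTHOR_NAME="$CORRECT_NAME"
            export GIT_AUTHOR_EMAIL="$CORRECT_EMAIL"
        fi
    ' --tag-name-filter cat -- --branches --tags
}

# Example call with placeholder values:
#   rewrite_identity old@example.com "New Name" new@example.com
```

Commits made under any other email address keep their original identity, since only commits matching the old address are touched. As with any history rewrite, the commit hashes change, so this is best done before anyone else has cloned the repository.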