
The Fact behind the Feminist Marvel Films

Wednesday, April 24, 2019

With the popularity of Wonder Woman, Jessica Jones, Black Widow, and Captain Marvel, I wondered whether there is any evidence of a shift toward gender equality in the film industry. I am going to analyze the gender ratio among actors across 2,000 films from the past two decades. This should show how female roles in the film industry have changed and how that relates to women's social status. Then we will look closely at the gender of the cast in Marvel movies, since they are representative of superhero movies.



The idea is to run a gender analysis on celebrity names and count how often each gender occurs in each movie, each year, using Python.


First, scrape the movie information from Box Office Mojo using Octoparse. I have used it many times, since it is free and does not limit the number of pages you can scrape. The URLs of Box Office Mojo's yearly box office charts follow a constant pattern, with a fixed hostname and a year tag (yr=) that changes. For example, the URL is https://www.boxofficemojo.com/yearly/chart/?yr=2019&p=.htm for 2019 and https://www.boxofficemojo.com/yearly/chart/?yr=2018&p=.htm for 2018. Following this pattern, we can build a list of URLs for every year from 2000 to 2019.
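The URL pattern can also be expanded programmatically. Here is a minimal Python sketch, assuming the pattern holds for every year in the range; the names `BASE_URL` and `yearly_chart_urls` are illustrative and not part of the original workflow:

```python
# Build one yearly-chart URL per year, following the pattern quoted in the text.
# BASE_URL and yearly_chart_urls are illustrative names, not part of any tool.
BASE_URL = "https://www.boxofficemojo.com/yearly/chart/?yr={year}&p=.htm"


def yearly_chart_urls(first_year=2000, last_year=2019):
    """Return the chart URL for every year in the inclusive range."""
    return [BASE_URL.format(year=year) for year in range(first_year, last_year + 1)]


urls = yearly_chart_urls()
print(len(urls))   # number of yearly pages to scrape
print(urls[0])     # first URL in the list
print(urls[-1])    # last URL in the list
```

The resulting list can be pasted into any scraper that accepts a batch of URLs.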




Load this list into Octoparse, and it will automatically create a loop over the URLs. Octoparse then guides you through creating a second extraction loop over the movies listed for each year; click the fields to extract, including Title, Actors, Distributors, Domestic_Total_Gross, and Foreign_Gross. About 20 minutes later, we have the details of 2,000 films across 20 years.




Second, clean up the data using Python so that the text is tokenized.
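The exact cleaning code is not shown here, so the following is only a sketch of what this step might look like. It assumes the scraped Actors field is a single string with names separated by commas or semicolons; the sample row and the helper names `split_actors` and `first_name` are hypothetical:

```python
import re


def split_actors(raw):
    """Split a raw actors string into a list of clean full names."""
    if not raw:
        return []
    # Assumption: names are separated by commas or semicolons.
    parts = re.split(r"[,;]", raw)
    # Collapse repeated whitespace inside each name and drop empty pieces.
    return [" ".join(part.split()) for part in parts if part.strip()]


def first_name(full_name):
    """Return the lowercased first token of a name, or '' if the name is empty."""
    tokens = full_name.split()
    return tokens[0].lower() if tokens else ""


# Hypothetical row, shaped like one scraped record.
row = {"Title": "Example Film", "Actors": "Jane Doe,  John  Smith ; Alex Roe"}
names = split_actors(row["Actors"])
print(names)
print([first_name(name) for name in names])
```

Lowercasing the first name at this stage makes the later dictionary lookup insensitive to capitalization.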




Third, count the female and male actors in each year's movies. To do this, I loaded a gender dictionary that looks up an actor's first name and returns the gender.
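The counting step could be sketched as follows. The tiny `GENDER_BY_FIRST_NAME` mapping is a made-up stand-in for the real gender dictionary, which is not shown in this post, and the film records are invented purely to illustrate the shape of the data:

```python
from collections import Counter, defaultdict

# Hypothetical stand-in for the gender dictionary; a real one would
# cover thousands of first names.
GENDER_BY_FIRST_NAME = {
    "jane": "female",
    "mary": "female",
    "john": "male",
    "peter": "male",
}


def guess_gender(full_name):
    """Look up the first name; return 'unknown' when it is not in the dictionary."""
    tokens = full_name.split()
    if not tokens:
        return "unknown"
    return GENDER_BY_FIRST_NAME.get(tokens[0].lower(), "unknown")


def count_genders_by_year(films):
    """Take (year, [actor names]) pairs and return {year: Counter of genders}."""
    counts = defaultdict(Counter)
    for year, actors in films:
        for name in actors:
            counts[year][guess_gender(name)] += 1
    return dict(counts)


# Invented records, one tuple per film.
films = [
    (2018, ["Jane Doe", "John Smith"]),
    (2018, ["Peter Roe", "Alex Poe"]),
    (2019, ["Mary Major"]),
]
counts = count_genders_by_year(films)
for year in sorted(counts):
    print(year, dict(counts[year]))
```

Keeping an explicit "unknown" bucket makes it visible how many names the dictionary could not classify, rather than silently dropping them from the totals.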




After getting the list, I visualized the data as below.




The two lines move in the same direction. Both rise until they peak in 2011 and have fallen since then. The number of actors is shrinking overall, which might indicate a downturn in the film industry. The gap between the two lines tends to narrow over time, although it widened between 2011 and 2015. That said, gender disparity remains entrenched in the film industry: male actors still outnumber female actors by more than two to one, even though the numbers are moving toward equality.


What about Marvel?



In contrast, both lines have moved upward since 2012, with a steep increase between 2012 and 2013. Superhero movies grew more popular during the economic recovery. Moreover, the number of female actors rose sharply compared with the years before 2012. This may reflect the film industry's attempt to bring more female actors into superhero series. The economic recovery in 2012 may have played an important role in balancing the numbers in superhero movies. The figure of the hero represents a national identity built on the ideas of “Freedom” and “Democracy.” Women have started to move the plot forward rather than merely support the leading actor. With Divergent (2014), Rogue One: A Star Wars Story (2016), The Hunger Games (2012), Lucy (2014), Mad Max: Fury Road (2015), and Wonder Woman, we are seeing a different type of heroine on screen. The popularity of superheroines suggests that women's social status is shifting toward a central, redemptive role.


Since the superhero was introduced in the 1930s, superhero movies have become an icon of crime-fighting, social righteousness, self-sacrifice, and, most importantly, male empowerment. The figure of the superhero is so successful that it has implanted the idea that a man is born to be a lifesaver. I can't say how much I appreciate that there are fewer characters like Mary Jane, a delicate, beautiful, but weak lady who was meant to be caught by the villain and saved by Spider-Man. I expect to see more women of color portrayed as strong and independent, like Furiosa in Mad Max: Fury Road and Captain Marvel, who are their own heroes.


Author: Ashley

Ashley is a data enthusiast and passionate blogger with hands-on experience in web scraping. She focuses on capturing web data and analyzing it in a way that empowers companies and businesses with actionable insights.


