Do you remember our idea of exploring some factors that influence the popularity and rating of movies?

Alejandro Rozo
Apr 23, 2020

If you do not, here is a short recap. Our idea was to present a scatter plot with budget and revenue as the axes, using additional visual channels to encode other characteristics such as genre, popularity, and rating. We also wanted to include a filter to show only the movies from a particular decade or lustrum (five-year period).

How did this idea evolve?

We realized that a scatter plot with all 2000 points did not give us any insight into a possible association, not even when we used a logarithmic scale.

Our next step was to split the data and facet it by lustrum. What was the result? We still did not get any insight.
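
As a rough illustration of the faceting step, the sketch below groups movie records into five-year buckets. The record layout (`title`, `year`) and the convention that a lustrum starts on a year divisible by five are assumptions made for this example, not details of our actual dataset.

```python
from collections import defaultdict


def lustrum_start(year: int) -> int:
    """Return the first year of the five-year bucket that contains `year`.

    Assumed convention: buckets start on years divisible by 5 (1990, 1995, ...).
    """
    return year - year % 5


def facet_by_lustrum(movies):
    """Group movie records by lustrum; each group would feed one small plot."""
    facets = defaultdict(list)
    for movie in movies:
        facets[lustrum_start(movie["year"])].append(movie)
    return dict(facets)


# Hypothetical records, only for illustration.
sample_movies = [
    {"title": "Movie A", "year": 1994},
    {"title": "Movie B", "year": 1995},
    {"title": "Movie C", "year": 1999},
]

for start, group in sorted(facet_by_lustrum(sample_movies).items()):
    print(start, [movie["title"] for movie in group])
```

The same function works for decades if the `5` is replaced by `10`.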

What else could we do? Our problem was the number of data points, so our next step was to aggregate! We aggregated by genre and plotted those points. Now we faced a lack of comparability in our main variables (popularity and rating): all the genres looked very similar.
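
To make the aggregation step concrete, here is a minimal sketch that collapses movies into one point per genre. It assumes each movie has a single `genre` and uses the arithmetic mean as the summary; both the field names and the choice of the mean are assumptions for illustration.

```python
from collections import defaultdict
from statistics import mean


def aggregate_by_genre(movies):
    """Collapse many movies into one summary record per genre."""
    groups = defaultdict(list)
    for movie in movies:
        groups[movie["genre"]].append(movie)
    return {
        genre: {
            "count": len(items),
            # Assumed summary statistic: the arithmetic mean of each variable.
            "mean_popularity": mean(m["popularity"] for m in items),
            "mean_rating": mean(m["rating"] for m in items),
        }
        for genre, items in groups.items()
    }


# Hypothetical records, only for illustration.
sample_movies = [
    {"genre": "Drama", "popularity": 10.0, "rating": 6.0},
    {"genre": "Drama", "popularity": 20.0, "rating": 8.0},
    {"genre": "Comedy", "popularity": 12.0, "rating": 6.5},
]

for genre, summary in sorted(aggregate_by_genre(sample_movies).items()):
    print(genre, summary)
```

Averaging over many movies pulls every genre toward the overall mean, which is one plausible reason the aggregated points ended up looking so alike.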

We had to redefine our idea! What did we come up with?

A circle-packing visualization: each movie is a circle whose size could represent its revenue or budget, the color hue differentiates genres, and all the movies of the same genre are packed into a bigger circle.

One advantage of this new proposal is that we can now compare revenue or budget across movies and genres.
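
One possible way to picture the circle-packing proposal is as a two-level hierarchy: a root, one node per genre, and one leaf per movie carrying the value that sets its circle size. The sketch below builds such a structure and sums the leaves per genre. The nested `name` / `children` / `value` layout and the field names are assumptions for illustration, not a final data format.

```python
from collections import defaultdict


def build_hierarchy(movies, size_field="revenue"):
    """Nest movies under their genre; `size_field` drives the circle size."""
    by_genre = defaultdict(list)
    for movie in movies:
        by_genre[movie["genre"]].append(
            {"name": movie["title"], "value": movie[size_field]}
        )
    return {
        "name": "movies",
        "children": [
            {"name": genre, "children": leaves}
            for genre, leaves in sorted(by_genre.items())
        ],
    }


def genre_totals(hierarchy):
    """Sum the leaf values of each genre, i.e. what the bigger circles compare."""
    return {
        genre["name"]: sum(leaf["value"] for leaf in genre["children"])
        for genre in hierarchy["children"]
    }


# Hypothetical records, only for illustration.
sample_movies = [
    {"title": "Movie A", "genre": "Drama", "revenue": 100, "budget": 40},
    {"title": "Movie B", "genre": "Drama", "revenue": 50, "budget": 30},
    {"title": "Movie C", "genre": "Comedy", "revenue": 70, "budget": 20},
]

print(genre_totals(build_hierarchy(sample_movies)))
print(genre_totals(build_hierarchy(sample_movies, size_field="budget")))
```

Switching `size_field` between `"revenue"` and `"budget"` changes what the circle sizes encode without touching the rest of the structure.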

What are our next steps?

First, we will explore what other information we can include. Second, we will decide how many data points to take into account. Third, we will check whether it is possible to add some temporal comparability.

In our next posts... How should our data be structured to make a circle-packing visualization?
