Creating an Airbnb-like Data Mart

Airbnb was founded in October 2007, stemming from personal experiences of encountering overpriced shared apartments and fully booked hotels during a crowded conference in San Francisco. The original name, Airbedandbreakfast, was later shortened to Airbnb in 2009. Serving as an online platform, Airbnb facilitates the connection between hosts and guests and assumes responsibility for managing the booking process.

Transactions are conducted through the platform, with guests making payments for their bookings using credit cards. To ensure that the accommodation matches its description on the platform, hosts receive payment 24 hours after the guest’s arrival. In 2013, Airbnb generated revenue by charging a commission of 6-12% from guests, and 3% from hosts, and amassed a total of $150 million from approximately 10 million overnight stays.

Every user, whether a host or a guest, creates a profile page on Airbnb. Hosts are required to upload at least one photo and provide a phone number, while guests need to provide more comprehensive information. Hosts have the option to describe their accommodations using text and photos. Both guests and hosts can rate each other based on their experiences. The platform also features a calculator function that allows users to estimate their expected income from their own accommodations.

To improve my SQL skills, I created a proof-of-concept for the Airbnb use case in the form of a data mart with the following list of requirements:

  • Entity Relationship Model (ERM): Development of an ERM that encompasses the definition of data tables, attributes, and relationships among entities pertaining to the Airbnb use case.
  • Database Management System (DBMS): Selection of a suitable Database Management System (DBMS), such as MySQL or any other SQL-based system, to implement the database for handling Airbnb-related information.
  • Database Structure: Establishment of an appropriate database structure that facilitates efficient storage and processing of data relevant to the Airbnb use case.
  • Dummy Data: Generation of realistic dummy data to populate the database, ensuring that it accurately reflects the various scenarios and characteristics encountered in real-world Airbnb situations.
  • Query Functionality: Creation of well-crafted queries that enable seamless retrieval and presentation of data, effectively showcasing the functionalities of the database.
  • Documentation: Thorough documentation of all implementation steps, including the development of the ERM, database structure, and SQL statements employed, ensuring comprehensive records of the entire process.
  • Data Storage Optimization: Implementation of data storage optimization techniques, including proper normalization of the database structure, to minimize redundancy and optimize data storage for essential information.

In the following sections, we will take a look at how the project was conceptualized and developed.

Conception

The first step of the project was to derive an appropriate database design. The database design process includes a crucial stage called database modeling, which holds the utmost significance in the overall database design. Any oversights or omissions during this step can have a detrimental impact on the implementation stage, potentially rendering the database useless.

To initiate the process, the first step involves crafting a comprehensive requirements specification for the Airbnb project. This specification document should encompass a thorough analysis of the requirements, delving into the following aspects in greater detail:

  • Identification of various roles or user groups involved in the system.
  • Specification of the actions or activities performed by each of these roles.
  • Determination of the necessary data and functions required to fulfill the system’s requirements.

The results are compiled in this PDF file.

Development

The next step was to implement the data mart. Throughout this phase, I ensured that all SQL statements were meticulously documented in my database file.

Specifically, I delivered the tables and relations for the database in a SQL data file, aligning them with the concepts outlined in the ER-Diagram. Additionally, I diligently documented every SQL statement involved in the creation process.

To ensure a comprehensive test of the system, I made certain that each table contained a minimum of 20 entries. Furthermore, I created at least one test case for the database, specifically designed to validate its functionality based on the ER-Model.

By adhering to the provided guidelines and ensuring the documentation and testing of the system, a first working version of the data mart was derived. A detailed explanation of the database design and implementation procedure can be seen in this PDF.

Finalization

In this final phase, the goal was to polish and refine the database management system, after having received feedback. Given that only a minor change to the schema of the database was necessary and the first iteration of the data mart was up to standard, this last phase was rather short. This document sums up the project and shows the changes to the database schema.

Conclusion

As for my last application, this project was a great opportunity to gain development experience and also to make a deep dive into pure SQL. I really like the final version of the data mart and especially the location search functionality. The code for the data mart and instructions on how to use it can be found in this GitHub repository.