A data lake is essentially a massive pool of raw data kept in its original format, whether structured, semi-structured, or unstructured. In a data lake, the use case for the data has not yet been determined, so the possibilities are open. Data can be transformed any way the user needs, which makes a lake especially good for exploratory data analysis. Because a data lake does not force data into a fixed structure before storing it, it can hold all file types, including pictures, videos, logs, and more. Data lakes are flexible and highly accessible, so they are typically well-suited to data scientists and analysts, who can get insights faster because the data can be reshaped to fit their needs. Keep in mind that without the right governance, a data lake can easily become a data swamp: a pile of data that nobody can find, trust, or use.
On the other hand, a data warehouse holds structured data with a well-defined use case. Changing the structure of the data in a data warehouse is time-consuming and can be expensive, so a warehouse is best for business professionals who repeatedly seek the same specific insights. The data in a data warehouse is in constant use and is highly relevant to the day-to-day running of a business, whereas data in a data lake can sit for years without being used.
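The contrast between the two can be sketched in a few lines of Python. This is only an illustration under stated assumptions: a list of JSON strings stands in for the lake, an in-memory SQLite table stands in for the warehouse, and the event data and the `read_purchases` helper are hypothetical. Real lakes and warehouses are far larger systems, but the idea is the same: the lake applies structure when the data is read, the warehouse when it is written.

```python
import json
import sqlite3

# Hypothetical raw events as they might land in a lake: no shared structure.
raw_events = [
    '{"type": "click", "customer": "ana", "page": "/pricing"}',
    '{"type": "purchase", "customer": "ben", "amount": 49.0}',
    '{"type": "log", "message": "cache miss", "level": "warn"}',
]


def read_purchases(lines):
    """Impose structure only at read time, when a question is asked."""
    records = (json.loads(line) for line in lines)
    return [
        (r["customer"], r["amount"])
        for r in records
        if r.get("type") == "purchase"
    ]


# Stand-in warehouse: the table structure is fixed before any data is loaded.
warehouse = sqlite3.connect(":memory:")
warehouse.execute(
    "CREATE TABLE purchases (customer TEXT NOT NULL, amount REAL NOT NULL)"
)
warehouse.executemany(
    "INSERT INTO purchases VALUES (?, ?)", read_purchases(raw_events)
)

lake_result = read_purchases(raw_events)
warehouse_result = warehouse.execute(
    "SELECT customer, amount FROM purchases"
).fetchall()
```

Both queries return the same purchase, but the lake still holds the click and log events for some future question, while the warehouse holds only what its table was designed for.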
Choosing between the two can be difficult because it really depends on your company's goals. Data lakes can store more data, as well as more varieties of data, and typically cost less. They are flexible and impose no fixed structure, making them a great choice for data scientists but a bit more difficult to use for most business professionals. A data warehouse, on the other hand, has its specific use cases already defined and its data already structured, making it simple for business professionals to get the insights they need. This structure comes with a downside, however. If new information needs to be brought in, changing the structure of the data warehouse could be quite costly, whereas that new data could easily be added to a data lake.
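To make the cost of structure concrete, here is a minimal sketch of what happens when a record arrives with a field nobody planned for. As an assumption for illustration, a Python list stands in for the lake and an in-memory SQLite table for the warehouse; the `coupon` field and all the data are hypothetical.

```python
import json
import sqlite3

# Stand-in lake: a plain list of raw JSON lines (hypothetical data).
lake = ['{"customer": "ana", "amount": 20.0}']

# Stand-in warehouse: a table whose columns were fixed up front.
warehouse = sqlite3.connect(":memory:")
warehouse.execute("CREATE TABLE purchases (customer TEXT, amount REAL)")
warehouse.execute("INSERT INTO purchases VALUES ('ana', 20.0)")

# A new record shows up with an extra field, "coupon".
new_record = {"customer": "ben", "amount": 49.0, "coupon": "SPRING"}
insert_sql = "INSERT INTO purchases (customer, amount, coupon) VALUES (?, ?, ?)"
values = (new_record["customer"], new_record["amount"], new_record["coupon"])

# The lake accepts the record unchanged.
lake.append(json.dumps(new_record))

# The warehouse rejects it, because the table has no "coupon" column.
try:
    warehouse.execute(insert_sql, values)
    rejected = False
except sqlite3.OperationalError:
    rejected = True

# Only after the schema is changed can the record be loaded.
warehouse.execute("ALTER TABLE purchases ADD COLUMN coupon TEXT")
warehouse.execute(insert_sql, values)

rows = warehouse.execute(
    "SELECT customer, amount, coupon FROM purchases ORDER BY rowid"
).fetchall()
```

In this toy case the schema change is a single statement. In a production warehouse the same change usually also touches the loading jobs, reports, and dashboards built on the old structure, which is where the time and expense come from.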
Overall, a data lake is likely to be the better, more flexible environment for the majority of companies. That said, if your company already has a well-established data warehouse, there is no reason to scrap it. In fact, having both a data lake and a data warehouse might be the best way of managing big data. You get the best of both worlds, and everyone in your organization, whether data scientist or business professional, will thank you.