Hi,
I'm working on a data warehouse and am deciding whether to use Hadoop or MySQL.
The dataset is likely to be no bigger than 40 GB for the first year, then perhaps 80 GB the next year, and possibly 120 GB the year after.
We want to be able to query all of the data at any point in the future. We aren't interested in throwing data away, since we can't foresee how we might want to use it.
So, would Hadoop be the right choice? I don't need high availability since this will be a back-office application, and the query load won't be a problem: if staff want to run queries, they don't need real-time results. Would it be better to just have a reasonably powerful MySQL server grinding through the data? At what point does Hadoop become useful? Initially we won't need more than a single node.
Any help would be appreciated.
Thanks