Do You Trust Google BigQuery with Your Big Data?
Google has come up with a fantastic service for analyzing large amounts of data. It’s called BigQuery, and it lets you run analysis on big data in the cloud. As expected, the tool has a superb, intuitive web UI, and the analysis language uses SQL-like queries. (Hive, anyone?) Have a look at the BigQuery tutorial; it looks pretty neat. So all you need to do to run queries is upload your data to Google through its upload form, which lets you either upload a file or point to one in Google’s cloud storage.
Now, the interesting question is this: to analyze your data with BigQuery, how much of it are you willing to give Google? And how long will the upload take? The answer won’t be “Let me quickly upload a 500 GB file and run some queries.” That amount of data would definitely take some time to upload. So, effectively, this SaaS becomes pretty useless as the volume of data you need to upload for analysis keeps growing.
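To put a rough number on “some time”, here is a back-of-the-envelope sketch in Python. The link speeds and the 80% efficiency factor are my own assumptions for illustration, not measurements of BigQuery or of any real network, and `upload_hours` is a hypothetical helper, not part of any Google API.

```python
def upload_hours(size_gb: float, uplink_mbps: float, efficiency: float = 0.8) -> float:
    """Estimate the hours needed to upload size_gb gigabytes.

    uplink_mbps is the nominal upload bandwidth in megabits per second.
    efficiency is an assumed fraction of that bandwidth you actually get
    (protocol overhead, contention); 0.8 is a guess, not a measured value.
    """
    if size_gb < 0 or uplink_mbps <= 0 or not 0 < efficiency <= 1:
        raise ValueError("size must be >= 0, bandwidth > 0, efficiency in (0, 1]")
    bits = size_gb * 8 * 1000**3                      # decimal gigabytes to bits
    effective_bps = uplink_mbps * 1_000_000 * efficiency
    return bits / effective_bps / 3600                # seconds to hours


if __name__ == "__main__":
    # Hypothetical uplink speeds, chosen only to show the order of magnitude.
    for mbps in (10, 100, 1000):
        print(f"{mbps:>5} Mbit/s: {upload_hours(500, mbps):.1f} hours")
```

Under these assumptions, a 500 GB file needs close to 14 hours on a 100 Mbit/s uplink and almost six days on a 10 Mbit/s one, and that is before a single query has run.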
Everyone trusts Google, so this concern might be easily ignored. But another potential problem I see is that you may end up violating your own privacy policies. Usually, the data you want to analyze contains sensitive information, such as user behavior patterns. How comfortable will your customers be if you hand that data over to Google? Even anonymizing the data might not save you from a potential legal breach.
I still believe setting up your own data analysis and monitoring platform is the best way to go. Thoughts? I’d love to hear them.
(Note: Opinions expressed in this article and its replies are the opinions of their respective authors and not those of DZone, Inc.)





