Data access
The majority of big data is used for commercial purposes to increase profits, provide better
services, or gain competitive advantage. Thus, organisations are hesitant to share their data
with outsiders. Even when organisations allow access to their data, they usually restrict
access to certain portions of it or impose rate limits on the amount of data that can be
accessed per day or per user. This makes it difficult not only for researchers and non-profit
organisations to obtain data, but also for organisations to integrate their own data with that
of other organisations. However, many countries now promote ‘open data’ portals, where datasets are
made available to the public.
Heterogeneity of data
Heterogeneity of data refers to how much the data varies within the dataset we are looking
at. This can include differences in data format, the number of missing values, the level of
detail, or the length of the time period for which data is available.
Heterogeneity is a particular issue when we bring together data from unconnected sources.
For example, it may be useful to combine population data from government sources with data
from environmental sensors when developing a drinking water management plan for a city.
The data from these different sources must be carefully matched to ensure valid analysis
results.
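The matching step described above can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not a prescribed method: the records, field names (district, site, residents, litres) and the function daily_usage_per_resident are all hypothetical. The sketch handles three kinds of heterogeneity mentioned earlier: different formats (date strings and name casing), different levels of detail (daily readings versus yearly counts), and missing values.

```python
from collections import defaultdict
from datetime import datetime
from statistics import mean

# Hypothetical government population records: one row per district per year.
population = [
    {"district": "North", "year": 2023, "residents": 12000},
    {"district": "South", "year": 2023, "residents": 8000},
    {"district": "East", "year": 2023, "residents": 5000},
]

# Hypothetical sensor readings: daily values, day/month/year date strings,
# lower-case site names, and one missing value (None).
sensor_readings = [
    {"site": "north", "date": "15/01/2023", "litres": 1_500_000.0},
    {"site": "north", "date": "16/01/2023", "litres": None},
    {"site": "north", "date": "17/01/2023", "litres": 1_700_000.0},
    {"site": "south", "date": "15/01/2023", "litres": 900_000.0},
]


def daily_usage_per_resident(population, sensor_readings):
    """Return mean daily litres per resident for each (district, year)."""
    # Step 1: bring the sensor data to the population data's level of detail
    # by grouping daily readings under a normalised (district, year) key.
    litres_by_key = defaultdict(list)
    for reading in sensor_readings:
        if reading["litres"] is None:
            continue  # skip missing values rather than treating them as zero
        year = datetime.strptime(reading["date"], "%d/%m/%Y").year
        district = reading["site"].strip().title()
        litres_by_key[(district, year)].append(reading["litres"])

    # Step 2: match on the shared key; districts without any sensor data
    # are reported as None instead of being silently dropped.
    result = {}
    for row in population:
        key = (row["district"], row["year"])
        values = litres_by_key.get(key)
        result[key] = mean(values) / row["residents"] if values else None
    return result


usage = daily_usage_per_resident(population, sensor_readings)
for key, value in usage.items():
    print(key, value)
```

Keeping unmatched districts in the result as None is a deliberate choice in this sketch: it makes gaps between the two sources visible, so an analyst can decide how to handle them rather than drawing conclusions from incomplete data.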
Data privacy and protection are not just important for individuals. Organisations also need to
have their data and intellectual property protected by policies and laws.