Questions Part 2

Question 1
True or false?
Cloud Datalab is a powerful interactive tool created to explore, analyze, transform, and
visualize data and to build machine learning models on Google Cloud Platform. It runs on
Google Compute Engine and connects easily to multiple cloud services so that you can focus
on your data science tasks.
Cloud Datalab is built on Jupyter (formerly known as IPython), which has an efficient
ecosystem of modules and a robust knowledge base.
Cloud Datalab lets you analyze your data in Google BigQuery, Cloud Machine Learning Engine,
Google Compute Engine, and Google Cloud Storage using Python, SQL, and JavaScript (for
BigQuery user-defined functions).
Scalable
Cloud Datalab covers analysis from megabytes to terabytes. Query terabytes of data in
BigQuery, run local analysis on sampled data, and run training jobs on terabytes of data in
Cloud Machine Learning Engine with ease.
Question 2
Dropout is a regularization technique for reducing overfitting in neural networks by preventing
complex co-adaptations on training data. It is a very efficient way of performing model
averaging with neural networks. The term "dropout" refers to dropping out units (both hidden
and visible) in a neural network.
Question 3
The Cloud Pub/Sub API exports usage metrics that can be monitored programmatically or
accessed via Stackdriver Monitoring. You can create dashboards, set up alerts, and set up an
autoscaler to manage publisher or message processing (subscriber) instances running on
Compute Engine.
Note that although Cloud Pub/Sub itself is scaled automatically, you are responsible for
managing your Pub/Sub quota.
True
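The autoscaling idea mentioned above can be sketched as a sizing rule: read a backlog figure
(for example, the number of undelivered messages on a subscription) and derive how many
subscriber instances to run. This is only an illustrative sketch. The function name
desired_subscribers, the per-instance capacity, and the instance bounds are hypothetical,
and no real Pub/Sub or Monitoring API is called here.

```python
import math


def desired_subscribers(backlog, per_instance_capacity,
                        min_instances=1, max_instances=10):
    """Return how many subscriber instances to run for a given backlog.

    backlog: undelivered message count read from a monitoring metric
        (assumed to be fetched elsewhere).
    per_instance_capacity: messages one instance is assumed to handle
        per scaling interval (hypothetical figure).
    """
    if per_instance_capacity <= 0:
        raise ValueError("per_instance_capacity must be positive")
    needed = math.ceil(backlog / per_instance_capacity)
    # Clamp to the allowed range so the group never scales to zero
    # and never grows past the configured ceiling.
    return max(min_instances, min(max_instances, needed))


small = desired_subscribers(250, per_instance_capacity=100)
large = desired_subscribers(5000, per_instance_capacity=100)
```

A real setup would feed this rule from the exported usage metrics and would also have to stay
within the project's Pub/Sub quota, which is not scaled automatically.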
Question 4
Read the link below
https://cloud.google.com/logging/docs/export/
Can you export copies of some or all of your logs outside of Stackdriver Logging?
Yes
Question 5
True or false?
Google Persistent Disk: fast and flexible block storage
Google Persistent Disk is a durable, high-performance block storage solution for Google
Cloud Platform. Persistent Disk provides SSD and HDD storage that can be attached to
instances running on Google Compute Engine or Google Kubernetes Engine. Storage volumes can
be resized transparently, backed up quickly, and can support simultaneous readers.
Question 6
Stackdriver aggregates metrics, logs, and infrastructure events, giving developers and
operators a rich set of observable signals that speed up root-cause analysis and reduce mean
time to resolution (MTTR). Stackdriver does not require extensive integration or multiple
"panels", nor does it force developers to use a specific cloud provider.
Stackdriver was built from the ground up for cloud applications. Whether you are running on
Google Cloud Platform, Amazon Web Services, on-premises infrastructure, or hybrid clouds,
Stackdriver combines metrics, logs, and metadata from all of your cloud accounts and projects
into a single comprehensive view of your environment. That way, you can quickly understand
service behavior and take action.
Stackdriver integrates natively with Google Cloud data tools such as BigQuery, Cloud Pub/Sub,
Cloud Storage, and Cloud Datalab, and integrates out of the box with all of your other
application components.
Sophisticated visualization and advanced alerting help you identify problems quickly, even
hard-to-diagnose ones such as host contention, cloud provider throttling, and hardware
degradation. Integration with popular services such as PagerDuty and Slack speeds up incident
response. Error tracking and reporting, together with integrated logging, enable drill-down
and root-cause analysis.
Question 7
On the concept of Continuous Learning
True or false?
Retraining Models on New Data
For a model to predict accurately, the data that it is making predictions on must have a
distribution similar to that of the data on which the model was trained. Because data distributions can be
expected to drift over time, deploying a model is not a one-time exercise but rather a
continuous process. It is a good practice to continuously monitor the incoming data and retrain
your model on newer data if you find that the data distribution has deviated significantly from
the original training data distribution. If monitoring data to detect a change in the data
distribution has a high overhead, then a simpler strategy is to train the model periodically, for
example, daily, weekly, or monthly.
True
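The monitoring step described above can be sketched as a simple drift check on one numeric
feature. This is a minimal illustration, not a recommended production test: it only compares
means, the function names drift_score and needs_retraining are hypothetical, and the
threshold of 1.0 training standard deviations is an arbitrary assumption.

```python
import statistics


def drift_score(train_values, new_values):
    """Shift of the new mean from the training mean, in training std units."""
    train_mean = statistics.mean(train_values)
    train_std = statistics.pstdev(train_values)
    if train_std == 0:
        raise ValueError("training feature has zero variance")
    return abs(statistics.mean(new_values) - train_mean) / train_std


def needs_retraining(train_values, new_values, threshold=1.0):
    """Flag retraining when the incoming data has drifted past the threshold."""
    return drift_score(train_values, new_values) > threshold


train = [10.0, 11.0, 9.0, 10.5, 9.5]      # feature values seen at training time
similar = [10.2, 9.8, 10.1, 9.9]          # incoming data, same distribution
shifted = [14.0, 15.0, 13.5, 14.5]        # incoming data after a shift

retrain_on_similar = needs_retraining(train, similar)
retrain_on_shifted = needs_retraining(train, shifted)
```

When even a check like this is too costly to run continuously, the periodic schedule
mentioned above (daily, weekly, or monthly) replaces the call to needs_retraining.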
Question 8
What is the “dropout” technique?

The original paper [1] that proposed neural network dropout is titled "Dropout: A simple way
to prevent neural networks from overfitting". That title pretty much explains in one sentence
what dropout does. Dropout works by randomly selecting and removing neurons in a neural
network during the training phase. Note that dropout is not applied during testing, so the
resulting network does not drop any neurons when making predictions.
This random removal of neurons prevents excessive co-adaptation of the neurons and, in so
doing, reduces the likelihood of the network overfitting.
The random removal of neurons during training also means that at any point in time, only a
portion of the original network is trained. In effect, you end up training multiple
sub-networks.
It is this repeated training of sub-networks, as opposed to the entire network, that gives
rise to the notion of dropout as a sort of ensemble technique. That is, training the
sub-networks is similar to training numerous relatively weak models and combining them to
form one model that is more powerful than the individual parts.
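The train-versus-test behavior described above can be sketched in a few lines. Note the
assumption: this uses the common "inverted" variant, which rescales the surviving units
during training, rather than scaling weights at test time as in the original paper. The
function name dropout_forward and its parameters are illustrative only.

```python
import random


def dropout_forward(activations, drop_prob, training, rng):
    """Apply inverted dropout to a list of unit activations.

    During training each unit is zeroed with probability drop_prob and the
    survivors are scaled by 1 / (1 - drop_prob). At test time the input is
    returned unchanged.
    """
    if not 0.0 <= drop_prob < 1.0:
        raise ValueError("drop_prob must be in [0, 1)")
    if not training or drop_prob == 0.0:
        return list(activations)
    keep_prob = 1.0 - drop_prob
    return [a / keep_prob if rng.random() < keep_prob else 0.0
            for a in activations]


rng = random.Random(0)  # fixed seed so the sketch is repeatable
hidden = [1.0] * 8
train_out = dropout_forward(hidden, drop_prob=0.5, training=True, rng=rng)
test_out = dropout_forward(hidden, drop_prob=0.5, training=False, rng=rng)
```

Each training call draws a fresh random mask, so successive calls effectively train
different sub-networks, which is the ensemble view described above.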
References:
[1] Srivastava, Nitish, et al. "Dropout: A simple way to prevent neural networks from
overfitting." The Journal of Machine Learning Research 15.1 (2014): 1929-1958.
True
Question 9
What is Dimensionality Reduction?

In machine learning classification problems, there are often too many factors on the basis of
which the final classification is done. These factors are basically variables called features. The
higher the number of features, the harder it gets to visualize the training set and then work on
it. Sometimes, most of these features are correlated, and hence redundant. This is where
dimensionality reduction algorithms come into play. Dimensionality reduction is the process of
reducing the number of random variables under consideration, by obtaining a set of principal
variables. It can be divided into feature selection and feature extraction.
True
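As one illustration of the feature selection branch mentioned above, the sketch below drops
any feature that is almost perfectly correlated with a feature already kept. It is a minimal
example under stated assumptions: the function names, the 0.95 threshold, and the sample
data are all hypothetical, and feature extraction methods are not shown.

```python
import math


def pearson(xs, ys):
    """Pearson correlation of two equal-length sequences."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    if var_x == 0 or var_y == 0:
        return 0.0  # treat a constant feature as uncorrelated
    return cov / math.sqrt(var_x * var_y)


def select_features(columns, threshold=0.95):
    """Keep a feature unless it is strongly correlated with one already kept."""
    kept = []
    for name, values in columns.items():
        if all(abs(pearson(values, columns[k])) < threshold for k in kept):
            kept.append(name)
    return kept


features = {
    "height_cm": [150.0, 160.0, 170.0, 180.0],
    "height_in": [59.1, 63.0, 66.9, 70.9],   # same information as height_cm
    "age": [30.0, 22.0, 41.0, 35.0],
}
selected = select_features(features)
```

Here the redundant height_in column is removed, so three features are reduced to two without
losing the information they carry.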
