The Numerati
Stephen Baker
Houghton Mifflin Co., 2008
US$26.00, 256 pages
ISBN-13: 978-0618784608

Notices of the AMS, Volume 56, Number 9, October 2009

Jeffrey Shallit is professor of computer science at the University of Waterloo, Ontario, Canada. His email address is shallit@cs.uwaterloo.ca.

Is it possible for a nonmathematician to write both accurately and entertainingly about a mathematical topic while still conveying something nontrivial about the mathematics? The answer is yes, but good examples are rare. Constance Reid did the trick with her book about Hilbert, and, to a lesser extent, with her book about Courant, but she had the advantage of having Julia Robinson for a sister. And of course Martin Gardner, who had little formal mathematical training, wrote the “Mathematical Games” column of Scientific American for many years, and introduced the beauty of mathematics to many young readers, including this reviewer.

Stephen Baker, the author of The Numerati, is, unfortunately, no Martin Gardner. By numerati (the word apparently first appeared in a 1990 review of a British art exhibit, written by Doron Swade) Baker means the kind of people who, were they working in the financial industry, would be called “quants”: people with very strong mathematical and computer skills who can analyze real-world problems. While “quants” study financial markets and build mathematical models, Baker’s numerati analyze large volumes of data collected electronically, in order to make predictions about human behavior in a variety of spheres: voting, employment, consumption, crime, illness, blogging, and marriage. Each of these activities gets a chapter devoted to it, in which Baker interviews several people who [...] marketable conclusions. “[T]hese mathematicians and computer scientists,” Baker intones sternly, “are in a position to rule the information of our lives.”

In a book whose subject is data, equations, and mathematical models, Baker is surprisingly shy about presenting any actual mathematics. Or perhaps it is not so surprising. Stephen Hawking once wrote, “Someone told me that each equation I included in the book would halve the sales. I therefore resolved not to have any equations at all. In the end, however, I did put in one equation, Einstein’s famous equation $E = mc^2$. I hope that this will not scare off half of my potential readers.” [2] Baker has taken Hawking one equation further.

Baker’s approach is almost entirely anecdotal. All told, he interviews about two dozen of the numerati, ranging from IBM’s Samer Takriti to Yahoo’s head of research, Prabhakar Raghavan, and asks them some not-very-revealing questions. We learn very little about their personalities and even less about what it is they do on a day-to-day basis.

Some of the anecdotes are, admittedly, interesting. I particularly enjoyed the plan to “put a wireless computer on half a million cows in Kansas”; with the data collected, researchers hope to determine what behavior patterns of cows are correlated with higher-quality meat. But some are not so interesting. Baker opens with a puzzle: why do people who rent romantic movies online also tend to click on an ad for rental cars, much more than the average user? The answer, when it comes, is not that surprising: lovers of romantic movies were attracted by the ads that promoted weekend “escapes”.

There is very little in The Numerati to interest the professional or amateur mathematician; this is the kind of book that a business executive might buy in an airport bookstore, hoping to learn something about mathematical modeling and the Internet, but I imagine even the business executive will find insufficient novelty in Baker’s modest survey. There’s just not enough detail provided to tell the reader very much about the main subject: the models and algorithms that extract meaning from large volumes of data.

As an example, consider this passage: “If one of Raghavan’s scientists gives an imprecise computer command while trawling through Yahoo’s data, he can send the company’s servers whirring madly through the noise for days on end. But a timely tweak in these instructions can speed up the hunt by a factor of 30,000. That reduces a 24-hour process to about three seconds. His point is that people with the right smarts can summon meaning from the nearly bottomless sea of data. It’s not easy, but they can find us there.”

Reading this, I can only wonder, what is an “imprecise computer command”? Does the passage concern a new breakthrough at Yahoo in search optimization, or something obvious that every undergraduate computer science student learns, such as binary search? Baker just doesn’t give enough detail to decide.

Baker emphasizes that the volume of data collected by the numerati requires new techniques, but he doesn’t really explain why. It would have been nice to read something along these lines: if we are working with small data sets, with hundreds or thousands of items, we can afford to use algorithms that run in linear, $O(n \log n)$, or even quadratic time. But, as my colleague Alex López-Ortiz has noted [3], when you are dealing with $2^{30}$ or even $2^{40}$ data points, the log factor is the difference between a query that completes in a second and one that completes in half a minute.

Too often Baker relies on clichés. Over and over, we are told that the goal of the numerati is to “turn us into dizzying combinations of numbers” (p. 13), to “turn IBM’s workers into numbers” (p. 20), and that they will view people as “boiled down to numbers” (p. 23) or “represented as a series of numbers” (p. 35). Of these, only the last is accurate. Sometimes, though, Baker says we are actually equations: “each of us [is] represented by scores of equations” (p. 42); “I had ... no clue as to what kind of equation I would become” (p. 99). This, even metaphorically, seems incorrect. People might be represented by numbers, and their relationships might be governed by equations, but it makes little sense to claim that an individual’s attributes are represented by an equation.

Although most of his account is accurate, as far as it goes, Baker does get some of the history wrong. He claims, for example, that “Google’s breakthrough, which transformed a simple search engine into a media giant, was the discovery that our queries, the words we type when we hunt for Web pages, are of immense value to advertisers”. This is incorrect. The site Goto.com allowed advertisers to bid on search results as early as February 1998, two years before Google did so. Google’s original noteworthy accomplishment, and the one that made it the search engine of choice, was its new algorithm, called PageRank, for deciding what Web pages provide good matches for a query.

PageRank represented the Web as a directed graph. Nodes are pages, and there’s a directed edge from page A to page B if A links to B. In its simplest form, PageRank assigned a weight W to the edge (A, B) with

$$W = \frac{\text{number of links from } B \text{ to } A}{\text{total number of pages that } B \text{ links to}}.$$

The resulting square matrix, called the “link matrix”, is column stochastic and has an eigenvalue of 1. The associated eigenvector, if it is unique and suitably normalized, gives the “rank” or importance of each page. (There is now more actual mathematics in this review than in all 244 pages of Baker’s book.) To make this idea work well in practice, we need uniqueness of the eigenvector and a fast way to calculate it, so the mathematical story doesn’t end here. But even in its infancy, PageRank helped Google give much better results than other search engines (so good that Google’s home page cockily offers an option labeled “I’m Feeling Lucky”, where only a single search result, the top one, is revealed), and it quickly became the search engine of choice. Although Google’s search engine has since moved far past PageRank, a mathematically savvy writer could have easily summarized these elementary ideas, or at least referred to the paper of Bryan and Leise [1].

Even when a simple geometric diagram would have enlightened the reader, Baker refuses to provide it. In talking with Mark Steitz, a Democratic consultant, he describes a “simplex triangle” that represents voters in an election. Each voter is represented by a point with two coordinates that represent (a) the likelihood of favoring one party over another and (b) the likelihood of actually going to the polls in any election. “Steitz draws a vertical line up the triangle, a so-called isoquant. Each voter along this line is of equal value, he says.” Although I imagine every reader of this review could produce the diagram Steitz has in mind, one picture here would be worth more than a hundred words.

In the chapter on politics, Baker discusses the difficulty of obtaining good data on who people are likely to vote for. Because of this, “proxies” are used; if you bought a Volvo and shop at Trader Joe’s, you might be more likely to vote for a Democrat than someone who’s an NRA member and drives a pickup truck. Geographical proxies can be good predictors, too, but Baker’s account is superficial compared to others, such as Michael Weiss’s The Clustering of America [5].

The contrast between this book and some related ones published recently is startling. For example, Emanuel Derman’s My Life as a Quant [4] is a memoir of the author’s career as a physicist, computer programmer, and financial wizard. Along the way, Derman provides portraits of Tsung-Dao Lee, the physicist who co-discovered the asymmetry of the weak interaction with C. N. Yang, and Fischer Black, co-creator of the Black-Scholes equation for the value of an option. Here is Derman on T. D. Lee:

“... every speaker felt compelled to focus on him; as they spoke, their eyes fixated only on him, and he let no statement he did not fully agree with pass him by. No matter who lectured at the seminar, T. D. concentrated intensely on their argument, and interrupted at the first instant something was not satisfactory. At times he broke in on the initial sentence of the talk, refusing to let a speaker proceed until the point was clarified. Sometimes clarification never came; I once witnessed the humiliation of a visiting postdoc who was forced to defend the first sentence he uttered for the entire hour and a half allowed for his seminar.”

Derman’s writing is witty, insightful, and moving; his prose is eloquent, and accurately captures the joys and sorrows of doing research. Derman’s book is not filled with equations, either, but he uses diagrams effectively to make his points, and describes, in a clear if nontechnical way, some of the ideas that excited him in physics and finance. As someone who has actually worked in mathematics, physics, and finance, Derman writes with an authority and insight that Baker cannot approach.

Very little of The Numerati is devoted to an analysis of the ethical and privacy concerns that data collection raises. Although Baker briefly discusses one way of hiding from the numerati (an initiative called Attention Trust), he says almost nothing about technologies for cryptography and anonymity. Modern cryptography, which is strongly mathematically based, offers us the hope that many of our transactions can take place veiled from the prying eyes of the numerati. And anonymous Web-surfing, based (for example) on technology from anonymizer.com or the Tor project, can prevent data collectors from linking online behavior with the specific person who is doing the surfing.

Ultimately, I did not find The Numerati a very satisfying account of its subject. I wanted more insight, something that Baker, with his nonmathematical background, could not provide. Perhaps I am unfair in criticizing Stephen Baker for not writing the book I would have wanted to read. The problem is, I don’t think he wrote the book that most people would have wanted to read.

References

[1] Kurt Bryan and Tanya Leise, The $25,000,000,000 eigenvector: The linear algebra behind Google, SIAM Review 48 (2006), 569–581.
[2] Stephen Hawking, A Brief History of Time, Bantam Books, 1998.
[3] Alejandro López-Ortiz, Algorithmic foundations of the Internet, Combinatorial and Algorithmic Aspects of Networking, Lecture Notes in Computer Science, Vol. 3405, Springer, Berlin, 2005, pp. 155–158.
[4] Emanuel Derman, My Life as a Quant: Reflections on Physics and Finance, Wiley, 2004.
[5] Michael J. Weiss, The Clustering of America, Tilden Press, 1988.
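The PageRank construction described in the review (a column-stochastic link matrix whose eigenvector for eigenvalue 1 ranks the pages) can be sketched in a few lines. This is only an illustration under stated assumptions, not Google's implementation: the four-page web in `LINKS` and the function names `link_matrix` and `page_rank` are hypothetical, and plain power iteration is used here simply as one standard way to approximate that eigenvector.

```python
# Hypothetical four-page web (pages numbered 0..3); not taken from the review.
# LINKS[b] lists the pages that page b links to, each listed once.
LINKS = {0: [1, 2, 3], 1: [2, 3], 2: [0], 3: [0, 2]}


def link_matrix(links, n):
    """Build the n x n link matrix: entry [a][b] is the number of links
    from b to a divided by the number of pages that b links to."""
    m = [[0.0] * n for _ in range(n)]
    for b, targets in links.items():
        for a in targets:
            m[a][b] += 1.0 / len(targets)
    return m


def page_rank(m, iterations=200):
    """Approximate the eigenvector for eigenvalue 1 by power iteration,
    normalized so that its entries sum to 1."""
    n = len(m)
    x = [1.0 / n] * n  # start from equal importance for every page
    for _ in range(iterations):
        # One step: multiply the current rank vector by the link matrix.
        x = [sum(m[i][j] * x[j] for j in range(n)) for i in range(n)]
        total = sum(x)
        x = [v / total for v in x]
    return x


if __name__ == "__main__":
    ranks = page_rank(link_matrix(LINKS, 4))
    for page, rank in enumerate(ranks):
        print(f"page {page}: {rank:.3f}")
```

For this small web the ranks converge to 12/31, 4/31, 9/31, and 6/31, so page 0 comes out as the most important. The sketch assumes that every page has at least one outgoing link and that the iteration converges; as the review notes, guaranteeing a unique eigenvector and computing it quickly at Web scale takes more work.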