Ebook610 pages6 hours

Scaling Apache Solr

Name: Scaling Apache Solr
Author: Hrishikesh Vijay Karambelkar
ISBN: 9781783981755

By Hrishikesh Vijay Karambelkar

Rating: 0 out of 5 stars

()

Read preview

About this ebook

This book is a step-by-step guide for readers who would like to learn how to build complete enterprise search solutions, with ample real-world examples and case studies. If you are a developer, designer, or architect who would like to build enterprise search solutions for your customers or organization, but have no prior knowledge of Apache Solr/Lucene technologies, this is the book for you.

Skip carousel

LanguageEnglish

PublisherPackt Publishing

Release dateJul 25, 2014

ISBN9781783981755

Author

Hrishikesh Vijay Karambelkar

Related authors

Skip carousel

Related to Scaling Apache Solr

Related ebooks

Skip carousel

Force.com Enterprise Architecture - Second Edition
Ebook
Force.com Enterprise Architecture - Second Edition
byAndrew Fawcett
Rating: 1 out of 5 stars
1/5
Force.com Enterprise Architecture
Ebook
Force.com Enterprise Architecture
byAndrew Fawcett
Rating: 5 out of 5 stars
5/5
Learning ELK Stack
Ebook
Learning ELK Stack
byChhajed Saurabh
Rating: 0 out of 5 stars
0 ratings
Learning Apache Spark 2
Ebook
Learning Apache Spark 2
byMuhammad Asif Abbasi
Rating: 0 out of 5 stars
0 ratings
Practical Web Development
Ebook
Practical Web Development
byPaul Wellens
Rating: 5 out of 5 stars
5/5
Applied SOA Patterns on the Oracle Platform
Ebook
Applied SOA Patterns on the Oracle Platform
bySergey Popov
Rating: 0 out of 5 stars
0 ratings
Big Data Analytics
Ebook
Big Data Analytics
byVenkat Ankam
Rating: 0 out of 5 stars
0 ratings
Apache Solr for Indexing Data
Ebook
Apache Solr for Indexing Data
byHandiekar Sachin
Rating: 0 out of 5 stars
0 ratings
AWS Administration – The Definitive Guide
Ebook
AWS Administration – The Definitive Guide
byWadia Yohan
Rating: 5 out of 5 stars
5/5
Apache Solr Search Patterns
Ebook
Apache Solr Search Patterns
byJayant Kumar
Rating: 0 out of 5 stars
0 ratings
Learning Hadoop 2
Ebook
Learning Hadoop 2
byGarry Turkington
Rating: 4 out of 5 stars
4/5
Scaling Big Data with Hadoop and Solr - Second Edition
Ebook
Scaling Big Data with Hadoop and Solr - Second Edition
byHrishikesh Vijay Karambelkar
Rating: 0 out of 5 stars
0 ratings
Scala for Java Developers
Ebook
Scala for Java Developers
byThomas Alexandre
Rating: 5 out of 5 stars
5/5
OpenStack Object Storage (Swift) Essentials
Ebook
OpenStack Object Storage (Swift) Essentials
byAmar Kapadia
Rating: 0 out of 5 stars
0 ratings
Mastering Akka
Ebook
Mastering Akka
byChristian Baxter
Rating: 0 out of 5 stars
0 ratings
Administrating Solr
Ebook
Administrating Solr
bySurendra Mohan
Rating: 0 out of 5 stars
0 ratings
Instant Apache Camel Messaging System
Ebook
Instant Apache Camel Messaging System
byEvgeniy Sharapov
Rating: 0 out of 5 stars
0 ratings
Mastering SaltStack
Ebook
Mastering SaltStack
byHall Joseph
Rating: 0 out of 5 stars
0 ratings
Java EE 7 Development with WildFly
Ebook
Java EE 7 Development with WildFly
byFrancesco Marchioni
Rating: 0 out of 5 stars
0 ratings
PHP Oracle Web Development: Data processing, Security, Caching, XML, Web Services, and Ajax
Ebook
PHP Oracle Web Development: Data processing, Security, Caching, XML, Web Services, and Ajax
byYuli Vasiliev
Rating: 0 out of 5 stars
0 ratings
Elements of Android Room
Ebook
Elements of Android Room
byMark Murphy
Rating: 0 out of 5 stars
0 ratings
Infinispan Data Grid Platform Definitive Guide
Ebook
Infinispan Data Grid Platform Definitive Guide
byWagner Roberto dos Santos
Rating: 0 out of 5 stars
0 ratings
Amazon SimpleDB: LITE
Ebook
Amazon SimpleDB: LITE
byPrabhakar Chaganti
Rating: 0 out of 5 stars
0 ratings
Mastering Web Application Development with Express
Ebook
Mastering Web Application Development with Express
byAlexandru Vlăduțu
Rating: 0 out of 5 stars
0 ratings
Learning Force.com Application Development
Ebook
Learning Force.com Application Development
byChamil Madusanka
Rating: 0 out of 5 stars
0 ratings
Getting Started with Oracle WebLogic Server 12c: Developer’s Guide
Ebook
Getting Started with Oracle WebLogic Server 12c: Developer’s Guide
byFabio Mazanatti Nunes
Rating: 0 out of 5 stars
0 ratings
Oracle Modernization Solutions
Ebook
Oracle Modernization Solutions
byTom Laszewski
Rating: 0 out of 5 stars
0 ratings
Clojure High Performance Programming - Second Edition
Ebook
Clojure High Performance Programming - Second Edition
byKumar Shantanu
Rating: 0 out of 5 stars
0 ratings
Building RESTful Python Web Services
Ebook
Building RESTful Python Web Services
byGastón C. Hillar
Rating: 5 out of 5 stars
5/5
Oracle Coherence 3.5
Ebook
Oracle Coherence 3.5
byAleksandar Seovic
Rating: 4 out of 5 stars
4/5

Computers For You

Skip carousel

Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
Ebook
Data Science from Scratch: The #1 Data Science Guide for Everything A Data Scientist Needs to Know: Python, Linear Algebra, Statistics, Coding, Applications, Neural Networks, and Decision Trees
bySteven Cooper
Rating: 4 out of 5 stars
4/5
AI Crash Course: A fun and hands-on introduction to machine learning, reinforcement learning, deep learning, and artificial intelligence with Python
Ebook
AI Crash Course: A fun and hands-on introduction to machine learning, reinforcement learning, deep learning, and artificial intelligence with Python
byHadelin de Ponteves
Rating: 0 out of 5 stars
0 ratings
Machine Learning for Beginners: An Introduction for Beginners, Why Machine Learning Matters Today and How Machine Learning Networks, Algorithms, Concepts and Neural Networks Really Work
Ebook
Machine Learning for Beginners: An Introduction for Beginners, Why Machine Learning Matters Today and How Machine Learning Networks, Algorithms, Concepts and Neural Networks Really Work
bySteven Cooper
Rating: 4 out of 5 stars
4/5
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
Ebook
Mastering ChatGPT: 21 Prompts Templates for Effortless Writing
byCea West
Rating: 5 out of 5 stars
5/5
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
Ebook
Creating Online Courses with ChatGPT | A Step-by-Step Guide with Prompt Templates
byCea West
Rating: 4 out of 5 stars
4/5
Computer Science: A Concise Introduction
Ebook
Computer Science: A Concise Introduction
byIan Sinclair
Rating: 4 out of 5 stars
4/5
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
Ebook
How to Create Cpn Numbers the Right way: A Step by Step Guide to Creating cpn Numbers Legally
byAlex Parkinson
Rating: 4 out of 5 stars
4/5
YouTube: How to Build and Optimize Your First YouTube Channel, Marketing, SEO, Tips and Strategies for YouTube Channel Success
Ebook
YouTube: How to Build and Optimize Your First YouTube Channel, Marketing, SEO, Tips and Strategies for YouTube Channel Success
byTommy Swindali
Rating: 4 out of 5 stars
4/5
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
Ebook
Excel Essentials: A Step-by-Step Guide with Pictures for Absolute Beginners to Master the Basics and Start Using Excel with Confidence
byNigel Tillery
Rating: 0 out of 5 stars
0 ratings
Python Machine Learning By Example
Ebook
Python Machine Learning By Example
byYuxi (Hayden) Liu
Rating: 4 out of 5 stars
4/5
Practical Lock Picking: A Physical Penetration Tester's Training Guide
Ebook
Practical Lock Picking: A Physical Penetration Tester's Training Guide
byDeviant Ollam
Rating: 5 out of 5 stars
5/5
Deep Search: How to Explore the Internet More Effectively
Ebook
Deep Search: How to Explore the Internet More Effectively
byAlan Pearce
Rating: 5 out of 5 stars
5/5
The Insider's Guide to Technical Writing
Ebook
The Insider's Guide to Technical Writing
byKrista Van Laan
Rating: 0 out of 5 stars
0 ratings
Grokking Algorithms: An illustrated guide for programmers and other curious people
Ebook
Grokking Algorithms: An illustrated guide for programmers and other curious people
byAditya Bhargava
Rating: 4 out of 5 stars
4/5
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
Ebook
SQL QuickStart Guide: The Simplified Beginner's Guide to Managing, Analyzing, and Manipulating Data With SQL
byWalter Shields
Rating: 4 out of 5 stars
4/5
Elon Musk
Ebook
Elon Musk
byWalter Isaacson
Rating: 4 out of 5 stars
4/5
Procreate for Beginners: Introduction to Procreate for Drawing and Illustrating on the iPad
Ebook
Procreate for Beginners: Introduction to Procreate for Drawing and Illustrating on the iPad
byAaron Smith
Rating: 0 out of 5 stars
0 ratings
The Mega Box: The Ultimate Guide to the Best Free Resources on the Internet
Ebook
The Mega Box: The Ultimate Guide to the Best Free Resources on the Internet
byChris Mason
Rating: 4 out of 5 stars
4/5
CompTIA Security+ Practice Questions
Ebook
CompTIA Security+ Practice Questions
byIP Specialist
Rating: 2 out of 5 stars
2/5
The Professional Voiceover Handbook: Voiceover training, #1
Ebook
The Professional Voiceover Handbook: Voiceover training, #1
byPeter Baker
Rating: 5 out of 5 stars
5/5
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
Ebook
Everybody Lies: Big Data, New Data, and What the Internet Can Tell Us About Who We Really Are
bySeth Stephens-Davidowitz
Rating: 4 out of 5 stars
4/5
CompTIA Certification: The Ultimate Guide To Discover CompTIA. Certified Quickly And Easily Passing The Certification Exam. Real Practice Test With Detailed Screenshots, Answers And Explanations
Ebook
CompTIA Certification: The Ultimate Guide To Discover CompTIA. Certified Quickly And Easily Passing The Certification Exam. Real Practice Test With Detailed Screenshots, Answers And Explanations
byDavid Mayer
Rating: 0 out of 5 stars
0 ratings
The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
Ebook
The Data Warehouse Toolkit: The Definitive Guide to Dimensional Modeling
byRalph Kimball
Rating: 0 out of 5 stars
0 ratings
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
Ebook
ChatGPT Ultimate User Guide - How to Make Money Online Faster and More Precise Using AI Technology
byMaximus Wilson
Rating: 0 out of 5 stars
0 ratings
The ChatGPT Millionaire Handbook: Make Money Online With the Power of AI Technology
Ebook
The ChatGPT Millionaire Handbook: Make Money Online With the Power of AI Technology
byTJ Books
Rating: 0 out of 5 stars
0 ratings
Ultimate Guide to Mastering Command Blocks!: Minecraft Keys to Unlocking Secret Commands
Ebook
Ultimate Guide to Mastering Command Blocks!: Minecraft Keys to Unlocking Secret Commands
byTriumph Books
Rating: 5 out of 5 stars
5/5
User Friendly: How the Hidden Rules of Design Are Changing the Way We Live, Work, and Play
Ebook
User Friendly: How the Hidden Rules of Design Are Changing the Way We Live, Work, and Play
byCliff Kuang
Rating: 4 out of 5 stars
4/5
Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics
Ebook
Standard Deviations: Flawed Assumptions, Tortured Data, and Other Ways to Lie with Statistics
byGary Smith
Rating: 4 out of 5 stars
4/5
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
Ebook
Python for Beginners. A Smarter Way to Learn Python in 5 Days and Remember it Longer. With Easy Step by Step Guidance and Hands on Examples. (Python Crash Course-Programming for Beginners)
byArthur T. Brooks
Rating: 0 out of 5 stars
0 ratings
101 Awesome Builds: Minecraft® Secrets from the World's Greatest Crafters
Ebook
101 Awesome Builds: Minecraft® Secrets from the World's Greatest Crafters
byTriumph Books
Rating: 4 out of 5 stars
4/5

Related podcast episodes

Skip carousel

S2 E19 - Dynamic Module + Component Loading Using any Observables: Single Page Applications (SPA) have many advantages, including increased interactivity, responsiveness, and user experience. However, a SPA often requires sending large chunks of JavaScript code to the client. This code must be downloaded and parsed...
Podcast episode
S2 E19 - Dynamic Module + Component Loading Using any Observables: Single Page Applications (SPA) have many advantages, including increased interactivity, responsiveness, and user experience. However, a SPA often requires sending large chunks of JavaScript code to the client. This code must be downloaded and parsed...
byThe Angular Plus Show
0 ratings
0% found this document useful
#608: Generative AI Roundup - August 2023: Simon takes you on a tour of your GenAI options. From software development, to AI policy, to trialli
Podcast episode
#608: Generative AI Roundup - August 2023: Simon takes you on a tour of your GenAI options. From software development, to AI policy, to trialli
byAWS Podcast
0 ratings
0% found this document useful
Episode 270: JSJ 269 Reusable React and JavaScript Components with Cory House
Podcast episode
Episode 270: JSJ 269 Reusable React and JavaScript Components with Cory House
byJavaScript Jabber
0 ratings
0% found this document useful
It’s like a HeatWave, Burning in my Heart with Nipun Agarwal: Nipun Agarwal has been a guest before on “Screaming,” but now he has graduated up to Senior VP at Oracle! Which means that Corey can throw harder questions Nipun's direction. Now that Nipun is a SVP, he is well suited to take them on, and we’re the better
Podcast episode
It’s like a HeatWave, Burning in my Heart with Nipun Agarwal: Nipun Agarwal has been a guest before on “Screaming,” but now he has graduated up to Senior VP at Oracle! Which means that Corey can throw harder questions Nipun's direction. Now that Nipun is a SVP, he is well suited to take them on, and we’re the better
byScreaming in the Cloud
0 ratings
0% found this document useful
Hasty Treat - Effortless Custom GraphQL with GraphQL Codegen: In this Hasty Treat, Scott and Wes talk about GraphQL tooling, and specifically a couple tools we use that will change your experience with GraphQL. .TECH Domains - Sponsor .TECH is taking the tech industry by storm. A domain that shows the world...
Podcast episode
Hasty Treat - Effortless Custom GraphQL with GraphQL Codegen: In this Hasty Treat, Scott and Wes talk about GraphQL tooling, and specifically a couple tools we use that will change your experience with GraphQL. .TECH Domains - Sponsor .TECH is taking the tech industry by storm. A domain that shows the world...
bySyntax - Tasty Web Development Treats
0 ratings
0% found this document useful
Fast Stream Processing In Python Using Faust with Ask Solem: Fast Stream Processing On Kafka In Python With Faust (Interview)
Podcast episode
Fast Stream Processing In Python Using Faust with Ask Solem: Fast Stream Processing On Kafka In Python With Faust (Interview)
byThe Python Podcast.__init__
0 ratings
0% found this document useful
Acorns for AWS
Podcast episode
Acorns for AWS
byThe Cloudcast
0 ratings
0% found this document useful
Composable Data Analytics
Podcast episode
Composable Data Analytics
byThe Cloudcast
0 ratings
0% found this document useful
Managing SAP to the Cloud
Podcast episode
Managing SAP to the Cloud
byThe Cloudcast
0 ratings
0% found this document useful
Oracle Data Lakehouse: With each passing day, more and more data sources are sending greater volumes of data across the globe. For any organization, this combination of structured and unstructured data continues to be a challenge. Data lakehouses link, correlate, and...
Podcast episode
Oracle Data Lakehouse: With each passing day, more and more data sources are sending greater volumes of data across the globe. For any organization, this combination of structured and unstructured data continues to be a challenge. Data lakehouses link, correlate, and...
byOracle University Podcast
0 ratings
0% found this document useful
A “API” Look Ahead for 2020
Podcast episode
A “API” Look Ahead for 2020
byThe Cloudcast
0 ratings
0% found this document useful
Serverless Data APIs
Podcast episode
Serverless Data APIs
byThe Cloudcast
0 ratings
0% found this document useful
A day late and a module short: WordPress 6.0 Beta 2 is now available for testing. The call continues to be made for testers and if you would like to contribute to this cycle jump over to make.wordpress.org. The WordPress Performance Group has a stable release of their plugin. The Performance Lab plugin is a collection of modules focused on enhancing performance of your site. This plugin allows you to enable and test the modules before they become available in WordPress core. Other news from the WordPress Performance team…the WebP by Default proposal is currently on hold after the community voiced critical feedback and significant technical concerns. Over on make.wordpress.org Phi Phan Launched a Separator Block With an Icon Option. Justin Tadlock covers the options of this plugin in his article over on the WPTavern. From Our Contributors and Producers Open source Calendly rival Cal.com has raised $25 million in a series A round of funding and launched what it calls an “a
Podcast episode
A day late and a module short: WordPress 6.0 Beta 2 is now available for testing. The call continues to be made for testers and if you would like to contribute to this cycle jump over to make.wordpress.org. The WordPress Performance Group has a stable release of their plugin. The Performance Lab plugin is a collection of modules focused on enhancing performance of your site. This plugin allows you to enable and test the modules before they become available in WordPress core. Other news from the WordPress Performance team…the WebP by Default proposal is currently on hold after the community voiced critical feedback and significant technical concerns. Over on make.wordpress.org Phi Phan Launched a Separator Block With an Icon Option. Justin Tadlock covers the options of this plugin in his article over on the WPTavern. From Our Contributors and Producers Open source Calendly rival Cal.com has raised $25 million in a series A round of funding and launched what it calls an “a
byThe WP Minute - WordPress news
0 ratings
0% found this document useful
Episode 448: RR 440: Swagger and OpenAPI with Josh Ponelat
Podcast episode
Episode 448: RR 440: Swagger and OpenAPI with Josh Ponelat
byRuby Rogues
0 ratings
0% found this document useful
Understanding Time-Series Database Patterns
Podcast episode
Understanding Time-Series Database Patterns
byThe Cloudcast
0 ratings
0% found this document useful
The AI Email Assistant I've Been Waiting for, with Andrew Lee of Shortwave: In this episode, Nathan sits down with Andrew Lee, founder of Shortwave, an AI-powered email app that describes itself as “the smartest email app on planet earth.” They discuss Shortwave’s RAG stack, how the AI assistant works, the models Shortwave is using, and much more. This is truly a masterclass in RAG app development.
Podcast episode
The AI Email Assistant I've Been Waiting for, with Andrew Lee of Shortwave: In this episode, Nathan sits down with Andrew Lee, founder of Shortwave, an AI-powered email app that describes itself as “the smartest email app on planet earth.” They discuss Shortwave’s RAG stack, how the AI assistant works, the models Shortwave is using, and much more. This is truly a masterclass in RAG app development.
by"The Cognitive Revolution" | AI Builders, Researchers, and Live Player Analysis
0 ratings
0% found this document useful
Episode 14. Optimizing for Performance - The tools: In the second part of optimization, we talk about the tools (or more properly algorithms) that you can use to optimize a piece of code. Ever wonder how to make a piece of code faster? or difference between caching and Divide-and-Conquer? then tune in!...
Podcast episode
Episode 14. Optimizing for Performance - The tools: In the second part of optimization, we talk about the tools (or more properly algorithms) that you can use to optimize a piece of code. Ever wonder how to make a piece of code faster? or difference between caching and Divide-and-Conquer? then tune in!...
byJava Pub House
0 ratings
0% found this document useful
Jay Ashe from Cava - Elixir in Production: We talk with Jay Ashe from Cava about their current and past Elixir projects and how they are deployed.
Podcast episode
Jay Ashe from Cava - Elixir in Production: We talk with Jay Ashe from Cava about their current and past Elixir projects and how they are deployed.
byElixir Wizards
0 ratings
0% found this document useful
WBSP418: Grow Your Business by Understanding Oracle JD Edwards' Capabilities, an Objective Panel Discussion
Podcast episode
WBSP418: Grow Your Business by Understanding Oracle JD Edwards' Capabilities, an Objective Panel Discussion
byWBSRocks: Business Growth with ERP and Digital Transformation
0 ratings
0% found this document useful
Managing Oracle Database with REST APIs and ADB Built-in Tools: In this episode, Lois Houston and Nikita Abraham are joined by Cloud Engineer Nick Commisso to talk about managing Oracle Database with REST APIs. They also look at Autonomous Database built-in tools, which are pre-assembled, pre-configured,...
Podcast episode
Managing Oracle Database with REST APIs and ADB Built-in Tools: In this episode, Lois Houston and Nikita Abraham are joined by Cloud Engineer Nick Commisso to talk about managing Oracle Database with REST APIs. They also look at Autonomous Database built-in tools, which are pre-assembled, pre-configured,...
byOracle University Podcast
0 ratings
0% found this document useful
Autonomous Database Tools: In this episode, hosts Lois Houston and Nikita Abraham speak with Oracle Database experts about the various tools you can use with Autonomous Database, including Oracle Application Express (APEX), Oracle Machine Learning, and more. Oracle...
Podcast episode
Autonomous Database Tools: In this episode, hosts Lois Houston and Nikita Abraham speak with Oracle Database experts about the various tools you can use with Autonomous Database, including Oracle Application Express (APEX), Oracle Machine Learning, and more. Oracle...
byOracle University Podcast
0 ratings
0% found this document useful
Revolutionizing Big Data Apps
Podcast episode
Revolutionizing Big Data Apps
byThe Cloudcast
0 ratings
0% found this document useful
DevOps and Incident Response Evolution
Podcast episode
DevOps and Incident Response Evolution
byThe Cloudcast
0 ratings
0% found this document useful
Build A Data Lake For Your Security Logs With Scanner: Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. The majority of products that are available either require too much effort to structure the logs, or aren't fast enough for interactive use cases. Cliff Crosland co-founded Scanner to provide fast querying of high scale log data for security auditing. In this episode he shares the story of how it got started, how it works, and how you can get started with it.
Podcast episode
Build A Data Lake For Your Security Logs With Scanner: Monitoring and auditing IT systems for security events requires the ability to quickly analyze massive volumes of unstructured log data. The majority of products that are available either require too much effort to structure the logs, or aren't fast enough for interactive use cases. Cliff Crosland co-founded Scanner to provide fast querying of high scale log data for security auditing. In this episode he shares the story of how it got started, how it works, and how you can get started with it.
byData Engineering Podcast
0 ratings
0% found this document useful
The Web3 Connectivity Problem | API3 | Dave Connor & Ashar Shahid | Polygon Alpha Podcast
Podcast episode
The Web3 Connectivity Problem | API3 | Dave Connor & Ashar Shahid | Polygon Alpha Podcast
byPolygon Alpha Podcast
0 ratings
0% found this document useful
Troubleshooting Kafka In Production: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: Troubleshooting in Production". In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate potential sources of trouble.
Podcast episode
Troubleshooting Kafka In Production: Kafka has become a ubiquitous technology, offering a simple method for coordinating events and data across different systems. Operating it at scale, however, is notoriously challenging. Elad Eldor has experienced these challenges first-hand, leading to his work writing the book "Kafka: Troubleshooting in Production". In this episode he highlights the sources of complexity that contribute to Kafka's operational difficulties, and some of the main ways to identify and mitigate potential sources of trouble.
byData Engineering Podcast
0 ratings
0% found this document useful
Tackling Real Time Streaming Data With SQL Using RisingWave: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
Podcast episode
Tackling Real Time Streaming Data With SQL Using RisingWave: Stream processing systems have long been built with a code-first design, adding SQL as a layer on top of the existing framework. RisingWave is a database engine that was created specifically for stream processing, with S3 as the storage layer. In this episode Yingjun Wu explains how it is architected to power analytical workflows on continuous data flows, and the challenges of making it responsive and scalable.
byData Engineering Podcast
0 ratings
0% found this document useful
#27: Serverless at A Cloud Guru with Dale Salter
Podcast episode
#27: Serverless at A Cloud Guru with Dale Salter
byReal World Serverless with theburningmonk
0 ratings
0% found this document useful
Designing A Non-Relational Database Engine: Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database.
Podcast episode
Designing A Non-Relational Database Engine: Databases come in a variety of formats for different use cases. The default association with the term "database" is relational engines, but non-relational engines are also used quite widely. In this episode Oren Eini, CEO and creator of RavenDB, explores the nuances of relational vs. non-relational engines, and the strategies for designing a non-relational database.
byData Engineering Podcast
0 ratings
0% found this document useful
#06 - Tech stack of Open Podcast: Which database is best?
Podcast episode
#06 - Tech stack of Open Podcast: Which database is best?
byTOPP - The Open Podcast Podcast
0 ratings
0% found this document useful

Skip carousel

Accurate, Open Source IP-based Localisation
Linux Format
Article
Accurate, Open Source IP-based Localisation
Dec 14, 2021
8 min read
Set Up A Production- Ready Web Server
APC
Article
Set Up A Production- Ready Web Server
Nov 4, 2019
8 min read
FLASK Web Frameworks
Linux Format
Article
FLASK Web Frameworks
Jun 4, 2019
The main focus of Python has always been to get you cracking on with your coding – the language was never made for web programming. However, this has just made it more interesting to extend the language for the web, or to create an interface to web-b
9 min read
Rediscover Speed With The Redis Revolution
Linux Format
Article
Rediscover Speed With The Redis Revolution
Jul 25, 2023
Credit: https://redis.io Redis is an open-source, in-memory data structure store that has gained popularity R as a highly efficient caching and messaging system. It prioritises speed, efficiency and versatility, making it a top choice for various ap
8 min read
Set Up A Production-ready Web Server
Linux Format
Article
Set Up A Production-ready Web Server
Sep 24, 2019
8 min read
Join the Pod, Man!
Linux Format
Article
Join the Pod, Man!
May 30, 2023
8 min read
Code An Admin Back-end In Django
Linux Format
Article
Code An Admin Back-end In Django
Dec 13, 2022
Credit: www.djangoproject.com OUR EXPERT Matt Holder has been a fan of the open source methodology for over two decades and uses Linux and other tools where possible. More featurepacked source code for this project can be downloaded from https://
6 min read
Code A Cataloguing Application In Python
Linux Format
Article
Code A Cataloguing Application In Python
Nov 15, 2022
Credit: www.djangoproject.com Matt Holder has been a fan of the open source methodology for over two decades and uses Linux and other tools where possible. More featurepacked source code for this project can be downloaded from https://github.com/mat
8 min read
How To Build The Linux Format Server
Linux Format
Article
How To Build The Linux Format Server
Oct 19, 2021
10 min read
A.I.-POWERED RASPBERRY Pi
Linux Format
Article
A.I.-POWERED RASPBERRY Pi
Sep 19, 2023
1 min read
Observability Of The Kernel And Containers
Linux Format
Article
Observability Of The Kernel And Containers
Apr 4, 2023
Mihalis Tsoukalos is currently working on Time Series. You can reach him at: @mactsouk. For our final delve into eBPF, we’re tackling applications, the kernel and Docker containers. At the end of the day, all Linux machines execute code for applicat
10 min read
Manipulate Data Like A Pro With Pandas
Linux Format
Article
Manipulate Data Like A Pro With Pandas
Jul 27, 2021
7 min read
5 Safari Alternatives Worth Trying
Macworld UK
Article
5 Safari Alternatives Worth Trying
Mar 13, 2020
4 min read
Safari Alternatives: Five Mac Web Browsers Worth Trying
MacWorld
Article
Safari Alternatives: Five Mac Web Browsers Worth Trying
Mar 17, 2020
4 min read
Lag Is Killing Games
Linux Format
Article
Lag Is Killing Games
Jan 11, 2022
8 min read
SolarWinds Network Performance Monitor 2022.4
PC Pro Magazine
Article
SolarWinds Network Performance Monitor 2022.4
Feb 9, 2023
2 min read
Build Your Own URL Shortening Service
Linux Format
Article
Build Your Own URL Shortening Service
May 4, 2021
7 min read
How We Tested…
Linux Format
Article
How We Tested…
May 4, 2021
Web browsers are among the most intuitive applications so we won’t test them on available documentation, even though this is an area where Falkon is severely lacking. As always, we want a feature-rich browser that can suitably replace the default off
1 min read
Scan And Scrape Websites Using Python
Linux Format
Article
Scan And Scrape Websites Using Python
Nov 14, 2023
David Bolton once accidentally boosted the traffic for his firm’s website by 25% in one day by running a web scraper on it. Luckily, they never found out! Ever since the web made an appearance back in the mid-’90s, programmers have been writing softw
6 min read
Yarp: A Similar Framework
Linux Format
Article
Yarp: A Similar Framework
Jan 12, 2021
YARP is the framework for communications within robotics. It can replace the ROS master as a name server. You can also do this the other way around using the YARP as a name server. The name server will support the nodes and protocols across your syst
1 min read
Liz Rice Chief Open Source Officer at Isovalent
Techfastly
Article
Liz Rice Chief Open Source Officer at Isovalent
Apr 1, 2022
5 min read
The Network NAS appliances 2024
PC Pro Magazine
Article
The Network NAS appliances 2024
Apr 4, 2024
4 min read
Even If You Love Safari, Here Are 5 Reasons To Try A New Browser On Your Mac
MacWorld
Article
Even If You Love Safari, Here Are 5 Reasons To Try A New Browser On Your Mac
Oct 18, 2022
3 min read
Build A Dynamic App Security Pipeline
Linux Format
Article
Build A Dynamic App Security Pipeline
Sep 21, 2021
8 min read
5 Tools That Integrate Your Cloud Storage Into Windows File Explorer
PCWorld
Article
5 Tools That Integrate Your Cloud Storage Into Windows File Explorer
Apr 30, 2024
6 min read
MacOS High Sierra: The New Safari Takes Steps to Reduce Persistent User Tracking, but Is It Enough?
MacWorld
Article
MacOS High Sierra: The New Safari Takes Steps to Reduce Persistent User Tracking, but Is It Enough?
Aug 18, 2017
5 min read
How To Setup A Killer Wensite In 2022
PC Pro Magazine
Article
How To Setup A Killer Wensite In 2022
Jan 6, 2022
8 min read
Applied Acoustic Systems Multiphonics CV-1 $99.00
Computer Music
Article
Applied Acoustic Systems Multiphonics CV-1 $99.00
Mar 23, 2022
While there is nothing new about modular synthesis, the undeniable rise of the Eurorack format in more recent years has influenced a number of companies to investigate the concept by creating a modular in software. This desktop-based format has found
4 min read
Disrupting Databases
APC
Article
Disrupting Databases
Apr 19, 2021
3 min read
Twenty Years Of WordPress Websites!
Linux Format
Article
Twenty Years Of WordPress Websites!
Oct 17, 2023
11 min read

Related categories

Skip carousel

Reviews for Scaling Apache Solr

Rating: 0 out of 5 stars

0 ratings

0 ratings0 reviews

Book preview

Scaling Apache Solr - Hrishikesh Vijay Karambelkar

Scaling Apache Solr

Credits

About the Author

About the Reviewers

www.PacktPub.com

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Reader feedback

Customer support

Downloading the example code

Errata

Piracy

Questions

1. Understanding Apache Solr

Challenges in enterprise search

Apache Solr – an overview

Features of Apache Solr

Solr for end users

Powerful full text search

Search through rich information

Results ranking, pagination, and sorting

Facets for better browsing experience

Advanced search capabilities

Administration

Apache Solr architecture

Storage

Solr application

Integration

Client APIs and SolrJ client

Other interfaces

Practical use cases for Apache Solr

Enterprise search for a job search agency

Problem statement

Approach

Enterprise search for energy industry

Problem statement

Approach

Summary

2. Getting Started with Apache Solr

Setting up Apache Solr

Prerequisites

Running Solr on Jetty

Running Solr on Tomcat

Solr administration

What's next?

Common problems and solution

Understanding the Solr structure

The Solr home directory structure

Solr navigation

Configuring the Apache Solr for enterprise

Defining a Solr schema

Solr fields

Dynamic Fields in Solr

Copying the fields

Field types

Other important elements in the Solr schema

Configuring Solr parameters

solr.xml and Solr core

solrconfig.xml

The Solr plugin

Other configurations

Understanding SolrJ

Summary

3. Analyzing Data with Apache Solr

Understanding enterprise data

Categorizing by characteristics

Categorizing by access pattern

Categorizing by data formats

Loading data using native handlers

Quick and simple data loading – post tool

Working with JSON, XML, and CSV

Handling JSON data

Working with CSV data

Working with XML data

Working with rich documents

Understanding Apache Tika

Using Solr Cell (ExtractingRequestHandler)

Adding metadata to your rich documents

Importing structured data from the database

Configuring the data source

Importing data in Solr

Full import

Delta import

Loading RDBMS tables in Solr

Advanced topics with Solr

Deduplication

Extracting information from scanned documents

Searching through images using LIRE

Summary

4. Designing Enterprise Search

Designing aspects for enterprise search

Identifying requirements

Matching user expectations through relevance

Access to searched entities and user interface

Improving search performance and ensuring instance scalability

Working with applications through federated search

Other differentiators – mobiles, linguistic search, and security

Enterprise search data-processing patterns

Standalone search engine server

Distributed enterprise search pattern

The replicated enterprise search pattern

Distributed and replicated

Data integrating pattern for search

Data import by enterprise search

Applications pushing data

Middleware-based integration

Case study – designing an enterprise knowledge repository search for software IT services

Gathering requirements

Designing the solution

Designing the schema

Integrating subsystems with Apache Solr

Working on end user interface

Summary

5. Integrating Apache Solr

Empowering the Java Enterprise application with Solr search

Embedding Apache Solr as a module (web application) in an enterprise application

How to do it?

Apache Solr in your web application

How to do it?

Integration with client technologies

Integrating Apache Solr with PHP for web portals

Interacting directly with Solr

Using the Solr PHP client

How to do it?

Advanced integration with Solarium

How to do it?

Integrating Apache Solr with JavaScript

Using simple XMLHTTPRequest

Integrating Apache Solr using AJAX Solr

Parsing Solr XML with the help of XSLT

Case study – Apache Solr and Drupal

How to do it?

Summary

6. Distributed Search Using Apache Solr

Need for distributed search

Distributed search architecture

Apache Solr and distributed search

Understanding SolrCloud

Why Zookeeper?

SolrCloud architecture

Building enterprise distributed search using SolrCloud

Setting up a SolrCloud for development

Setting up a SolrCloud for production

Adding a document to SolrCloud

Creating shards, collections, and replicas in SolrCloud

Common problems and resolutions

Case study – distributed enterprise search server for the software industry

Summary

7. Scaling Solr through Sharding, Fault Tolerance, and Integration

Enabling search result clustering with Carrot2

Why Carrot2?

Enabling Carrot2-based document clustering

Understanding Carrot2 result clustering

Viewing Solr results in the Carrot2 workbench

FAQs and problems

Sharding and fault tolerance

Document routing and sharding

Shard splitting

Load balancing and fault tolerance in SolrCloud

Searching Solr documents in near real time

Strategies for near real-time search in Apache Solr

Explicit call to commit from a client

solrconfig.xml – autocommit

CommitWithin – delegating the responsibility to Solr

Real-time search in Apache Solr

Solr with MongoDB

Understanding MongoDB

Installing MongoDB

Creating Solr indexes from MongoDB

Scaling Solr through Storm

Getting along with Apache Storm

Solr and Apache Storm

Summary

8. Scaling Solr through High Performance

Monitoring performance of Apache Solr

What should be monitored?

Hardware and operating system

Java virtual machine

Apache Solr search runtime

Apache Solr indexing time

SolrCloud

Tools for monitoring Solr performance

Solr administration user interface

JConsole

SolrMeter

Tuning Solr JVM and container

Deciding heap size

How can we optimize JVM?

Optimizing JVM container

Optimizing Solr schema and indexing

Stored fields

Indexed fields and field lengths

Copy fields and dynamic fields

Fields for range queries

Index field updates

Synonyms, stemming, and stopwords

Tuning DataImportHandler

Speeding up index generation

Committing the change

Limiting indexing buffer size

SolrJ implementation classes

Speeding Solr through Solr caching

The filter cache

The query result cache

The document cache

The field value cache

The warming up cache

Improving runtime search for Solr

Pagination

Reducing Solr response footprint

Using filter queries

Search query and the parsers

Lazy field loading

Optimizing SolrCloud

Summary

9. Solr and Cloud Computing

Enterprise search on Cloud

Models of engagement

Enterprise search Cloud deployment models

Solr on Cloud strategies

Scaling Solr with a dedicated application

Advantages

Disadvantages

Scaling Solr horizontal as multiple applications

Advantages

Disadvantages

Scaling horizontally through the Solr multicore

Scaling horizontally with replication

Scaling horizontally with Zookeeper

Advantages

Disadvantages

Running Solr on Cloud (IaaS and PaaS)

Running Solr with Amazon Cloud

Running Solr on Windows Azure

Running Solr on Cloud (SaaS) and enterprise search as a service

Running Solr with OpenSolr Cloud

Running Solr with SolrHQ Cloud

Running Solr with Bitnami

Working with Amazon CloudSearch

Drupal-Solr SaaS with Acquia

Summary

10. Scaling Solr Capabilities with Big Data

Apache Solr and HDFS

Big Data search on Katta

How Katta works?

Setting up Katta cluster

Creating Katta indexes

Using the Solr 1045 patch – map-side indexing

Using the Solr 1301 patch – reduce-side indexing

Apache Solr and Cassandra

Working with Cassandra and Solr

Single node configuration

Integrating with multinode Cassandra

Advanced analytics with Solr

Integrating Solr and R

Summary

A. Sample Configuration for Apache Solr

schema.xml

solrconfig.xml

spellings.txt

synonyms.txt

protwords.txt

stopwords.txt

Index

Scaling Apache Solr

All rights reserved. No part of this book may be reproduced, stored in a retrieval system, or transmitted in any form or by any means, without the prior written permission of the publisher, except in the case of brief quotations embedded in critical articles or reviews.

Every effort has been made in the preparation of this book to ensure the accuracy of the information presented. However, the information contained in this book is sold without warranty, either express or implied. Neither the author, nor Packt Publishing, and its dealers and distributors will be held liable for any damages caused or alleged to be caused directly or indirectly by this book.

Packt Publishing has endeavored to provide trademark information about all of the companies and products mentioned in this book by the appropriate use of capitals. However, Packt Publishing cannot guarantee the accuracy of this information.

First published: July 2014

Production reference: 1180714

Published by Packt Publishing Ltd.

Livery Place

35 Livery Street

Birmingham B3 2PB, UK.

ISBN 978-1-78398-174-8

www.packtpub.com

Cover image by Maria Cristina Caggiani (<mariacristinacaggiani@virgilio.it>)

Credits

Author

Hrishikesh Vijay Karambelkar

Reviewers

Ramzi Alqrainy

Aamir Hussain

Ruben Teijeiro

Nick Veenhof

Commissioning Editor

Andrew Duckworth

Acquisition Editor

Neha Nagwekar

Content Development Editor

Neil Alexander

Technical Editors

Monica John

Shiny Poojary

Copy Editors

Sayanee Mukherjee

Karuna Narayanan

Adithi Shetty

Laxmi Subramanian

Project Coordinators

Sanchita Mandal

Kartik Vedam

Proofreaders

Mario Cecere

Paul Hindle

Jonathan Todd

Indexers

Hemangini Bari

Rekha Nair

Tejal Soni

Priya Subramani

Graphics

Abhinash Sahu

Production Coordinator

Nilesh R. Mohite

Cover Work

Nilesh R. Mohite

About the Author

Hrishikesh Vijay Karambelkar is an enterprise architect with a blend of technical and entrepreneurial experience of more than 13 years. His core expertise involves working with multiple topics that include J2EE, Big Data, Solr, Link Analysis, and Analytics. He enjoys architecting solutions for the next generation of product development for IT organizations. He spends most of his work time now solving challenging problems faced by the software industry.

In the past, he has worked in the domains of graph databases; some of his work has been published in international conferences, such as VLDB, ICDE, and so on. He has recently published a book called Scaling Big Data with Hadoop and Solr, Packt Publishing. He enjoys spending his leisure time traveling, trekking, and photographing birds in the dense forests of India. He can be reached at http://hrishikesh.karambelkar.co.in/.

I am thankful to Packt Publishing for providing me with the opportunity to share my experiences on this topic. I would also like to thank the reviewers of this book, especially Nick Veenhof from Acquira for shaping it up; the Packt Publishing team, Kartik Vedam and Neil Alexander, for their support during bad times; and my dear wife Dhanashree, who stood by me all the time.

About the Reviewers

Ramzi Alqrainy is one of the most recognized experts within Artificial Intelligence and Information Retrieval fields in the Middle East. He is an active researcher and technology blogger, with a focus on information retrieval.

He is currently managing the search and reporting functions at OpenSooq.com where he capitalizes on his solid experience in open source technologies in scaling up the search engine and supportive systems.

His solid experience in Solr, ElasticSearch, Mahout, and Hadoop stack contributed directly to the business growth through the implementations and projects that helped the key people at OpenSooq to slice and dice information easily throughout the dashboards and data visualization solutions.

By developing more than six full-stack search engines, he was able to solve many complicated challenges about agglutination and stemming in the Arabic language.

He holds a Master's degree in Computer Science. He was among the top three in his class and was listed on the honor roll. His website address is http://ramzialqrainy.com. His LinkedIn profile is http://www.linkedin.com/in/ramzialqrainy. He can be contacted at .

Aamir Hussain is a well-versed software design engineer with more than 5 years of experience. He excels in solving problems that involve Breadth. He has gained expert internal knowledge by developing software that are used by millions of users. He developed complex software systems using Python, Django, Solr, MySQL, mongoDB, HTML, JavaScript, CSS, and many more open source technologies. He is determined to get a top-quality job done by continually learning new technologies.

His other experiences include analyzing and designing requirements, WEB2 and new technologies, content management, service management that includes fixing problems, change control and management, software release, software testing, service design, service strategy, and continual service improvement. His specialties include complex problem solving, web portal architecture, talent acquisition, team building, and team management.

Ruben Teijeiro is an experienced frontend and backend web developer who has worked with several PHP frameworks for more than a decade. He is now using his expertise to focus on Drupal; in fact, he collaborated with them during the development of several projects for some important companies, such as UNICEF and Telefonica in Spain, and Ericsson in Sweden.

As an active member of the Drupal community, you can find him contributing to Drupal core, helping and mentoring other contributors and speaking at Drupal events around the world. He also loves to share all that he has learned through his blog (http://drewpull.com).

I would like to thank my parents for their help and support, since I had my first computer when I was eight. I would also like to thank my lovely wife for her patience during all these years of geeking and traveling.

Nick Veenhof is a Belgian inhabitant who has lived in a couple of different countries, but has recently moved back home. He is currently employed at Acquia as a Lead Search Engineer. He also helps in maintaining the Drupal Apache Solr Integration module and also actively maintains and develops new infrastructure tools for deploying Drupal sites on Acquia Cloud with Apache Solr.

He is also advocating open source and especially the open source project, Drupal. This software crossed his path during his young university years and became part of his career when he started to use it for professional purposes in several Drupal-focused development shops.

As a logical step, he is also active in the Drupal community, as you can see in his drupal.org profile (https://www.drupal.org/user/122682), and tries to help people online with their first steps.

You can easily find out more about him by searching for his name, Nick Veenhof, or his nickname, Nick_vh, on Google.

www.PacktPub.com

Support files, eBooks, discount offers, and more

You might want to visit www.PacktPub.com for support files and downloads related to your book.

Did you know that Packt offers eBook versions of every book published, with PDF and ePub files available? You can upgrade to the eBook version at www.PacktPub.com and as a print book customer, you are entitled to a discount on the eBook copy. Get in touch with us at for more details.

At www.PacktPub.com, you can also read a collection of free technical articles, sign up for a range of free newsletters and receive exclusive discounts and offers on Packt books and eBooks.

http://PacktLib.PacktPub.com

Do you need instant solutions to your IT questions? PacktLib is Packt's online digital book library. Here, you can access, read and search across Packt's entire library of books.

Why subscribe?

Fully searchable across every book published by Packt

Copy and paste, print and bookmark content

On demand and accessible via web browser

Free access for Packt account holders

If you have an account with Packt at www.PacktPub.com, you can use this to access PacktLib today and view nine entirely free books. Simply use your login credentials for immediate access.

I dedicate my work to my mentors who inspired my life and made me what I am today: my schoolteacher Mrs. Argade; Prof. S. Sudarshan from IIT Bombay; and Danesh Tarapore, Postdoctoral Researcher at Lisbon, Portugal.

Preface

With the growth of information assets in enterprises, the need to build a rich, scalable search application has becomes critical. Today, Apache Solr is one of the most widely adapted, scalable, feature-rich, and high-performance open source search application servers.

Scaling Apache Solr is intended to enable its users to transform Apache Solr from a basic search server, to an enterprise-ready, high performance, and scalable search application. This book is a comprehensive reference guide that starts with the basic concepts of Solr; it takes users through the journey of how Solr can be used in enterprise deployments, and it finally dives into building a highly efficient, scalable enterprise search application using various techniques and numerous practical chapters.

What this book covers

Chapter 1, Understanding Apache Solr, introduces readers to Apache Solr as a search server. It discusses problems addressed by Apache Solr, its architecture, features, and, finally, covers different use cases for Solr.

Chapter 2, Getting Started with Apache Solr, focuses on setting up the first Apache Solr instance. It provides a detailed step-by-step guide for installing and configuring Apache Solr. It also covers connecting to Solr through SolrJ libraries.

Chapter 3, Analyzing Data with Apache Solr, covers some of the common enterprise search problems of bringing information from disparate sources and different information flow patterns to Apache Solr with minimal losses. It also covers advanced topics for Apache Solr such as deduplication, and searching through images.

Chapter 4, Designing Enterprise Search, introduces its readers to the various aspects of designing a search for enterprises. It covers topics pertaining to different data processing patterns of enterprises, integration patterns, and a case study of designing a knowledge repository for the software IT industry.

Chapter 5, Integrating Apache Solr, focuses on the integration aspects of Apache Solr with different applications that are commonly used in any enterprise. It also covers different ways in which Apache Solr can be introduced by the enterprise to its users.

Chapter 6, Distributed Search Using Apache Solr, starts with the need for distributed search for an enterprise, and it provides a deep dive to building distributed search using Apache SolrCloud. Finally, the chapter covers common problems and a case study of distributed search using Apache Solr for the software industry.

Chapter 7, Scaling Solr through Sharding, Fault Tolerance, and Integration, discusses various aspects of enabling enterprise Solr search server to scaling by means of sharding and search result clustering. It covers integration aspects of Solr with other tools such as MongoDB, Carrot2, and Storm to achieve the scalable search solution for an enterprise.

Chapter 8, Scaling Solr through High Performance, provides insights of how Apache Solr can be transformed into a high-performance search engine for an enterprise. It starts with monitoring for performance, and how the Solr application server can be tuned for performing better with maximum utilization of the available sources.

Chapter 9, Solr and Cloud Computing, focuses on enabling Solr on cloud-based infrastructure for scalability. It covers different Solr strategies for cloud, their applications, and deep dive into using Solr in different types of cloud environment.

Chapter 10, Scaling Solr Capabilities with Big Data, covers aspects of working with a very high volume of data (Big Data), and how Solr can be used to work with Big Data. It discusses how Apache Hadoop and its ecosystem can be used to build and run efficient Big Data enterprise search solutions.

Appendix, Sample Configuration for Apache Solr, provides a reference configuration for different Solr configuration files with detailed explanations.

What you need for this book

This book discusses different approaches; each approach needs a different set of software. Based on the requirement of building search applications, the respective software can be used. However, to run a minimal Apache Solr instance, you need the following software:

Latest available JDK

Latest Apache Solr (4.8 onwards)

Who this book is for

Scaling Apache Solr provides step-by-step guidance for any user who intends to build high performance, scalable, enterprise-ready search application servers. This book will appeal to developers, architects, and designers who wish to understand Apache Solr, design the enterprise-ready application, and optimize it based on the requirements. This book enables users to build a scalable search without prior knowledge of Solr with practical examples and case studies.

Conventions

In this book, you will find a number of styles of text that distinguish between different kinds of information. Here are some examples of these styles, and an explanation of their meaning.

Code words in text, database table names, folder names, filenames, file extensions, pathnames, dummy URLs, user input, and Twitter handles are shown as follows: You can simply add the following dependency to your pom.xml file to use SolrJ.

A block of code is set as follows:

CloudSolrServer server = new CloudSolrServer(localhost:9983);

server.setDefaultCollection(collection1);

SolrQuery solrQuery = new SolrQuery(*.*);

Any command-line input or output is written as follows:

$tar –xvzf solr-.tgz$tar –xvzf solr-.tgz

New terms and important words are shown in bold. Words that you see on the screen, in menus or dialog boxes for example, appear in the text like this: The OK status explains that Solr is running fine on the server.

Note

Warnings or important notes appear in a box like this.

Tip

Tips and tricks appear like this.

Reader feedback

Feedback from our readers is always welcome. Let us know what you think about this book—what you liked or may have disliked. Reader feedback is important for us to develop titles that you really get the most out of.

To send us general feedback, simply send an e-mail to <feedback@packtpub.com>, and mention the book title via the subject of your message.

If there is a topic that you have expertise in and you are interested in either writing or contributing to a book, see our author guide on www.packtpub.com/authors.

Customer support

Now that you are the proud owner of a Packt book, we have a number of things to help you to get the most from your purchase.

Downloading the example code

You can download the example code files for all Packt books you have purchased from your account at http://www.packtpub.com. If you purchased this book elsewhere, you can visit http://www.packtpub.com/support and register to have the files e-mailed directly to you.

Errata

Although we have taken every care to ensure the accuracy of our content, mistakes do happen. If you find a mistake in one of our books—maybe a mistake in the text or the code—we would be grateful if you would report this to us. By doing so, you can save other readers from frustration and help us improve subsequent versions of this book. If you find any errata, please report them by visiting http://www.packtpub.com/submit-errata, selecting your book, clicking on the errata submission form link, and entering the details of your errata. Once your errata are verified, your submission will be accepted and the errata will be uploaded on our website, or added to any list of existing errata, under the Errata section of that title. Any existing errata can be viewed by selecting your title from http://www.packtpub.com/support.

Piracy

Piracy of copyright material on the Internet is an ongoing problem across all media. At Packt, we take

Enjoying the preview?

Page 1 of 1

Scaling Apache Solr

About this ebook

Hrishikesh Vijay Karambelkar

Related authors

Related to Scaling Apache Solr

Related ebooks

Computers For You

Related podcast episodes

Related articles

Related categories

Reviews for Scaling Apache Solr

What did you think?

Book preview

Scaling Apache Solr - Hrishikesh Vijay Karambelkar

Table of Contents

Scaling Apache Solr

Scaling Apache Solr

Credits

About the Author

About the Reviewers

Support files, eBooks, discount offers, and more

Why subscribe?

Free access for Packt account holders

Preface

What this book covers

What you need for this book

Who this book is for

Conventions

Note

Tip

Reader feedback

Customer support

Downloading the example code

Errata

Piracy