MassiveSearchBundle

This is the documentation for the MassiveSearchBundle.

Introduction

The MassiveSearchBundle provides an extensible, localized search abstraction which is concerned primarily with providing “site” search capabilities.

What it does

It allows you to map documents using XML (or a custom driver), index them with a search adapter and search for them. The search “results” (documents) are returned in a format focused on the use case of providing a list of search results on which the user clicks.

For example, a typical use case would be to provide a search results page as follows:

+--------+ Search result 1
|        |
| <img>  | Some description of this result
|        |
+--------+

+--------+ Search result 2
|        |
| <img>  | Some description for search result 2
|        |
+--------+

Just to be clear: it is not designed for anything else.

Quick example

This example will assume you want to index a Product entity using the Doctrine ORM.

Note

The bundle is in no way coupled to the Doctrine ORM, and it is possible to use it with any persistence system.

Enable the Doctrine ORM support in your main configuration:

massive_search:
    persistence:
        doctrine_orm:
            enabled: true

Then enable one of the search adapters (see Search Adapters).

Create your model in <YourBundle>/Entity/Product.php:

<?php

// <YourBundle>/Entity/Product.php

namespace Acme\YourBundle\Entity;

use Doctrine\ORM\Mapping as ORM;

/**
 * @ORM\Entity
 * @ORM\Table(name="product")
 */
class Product
{
    /**
     * @ORM\Column(type="integer")
     * @ORM\Id
     * @ORM\GeneratedValue(strategy="AUTO")
     */
    protected $id;

    /**
     * @ORM\Column(type="string", length=100)
     */
    protected $name;

    /**
     * @ORM\Column(type="decimal", scale=2)
     */
    protected $price;

    /**
     * @ORM\Column(type="text")
     */
    protected $description;
}

Place the following mapping file at <YourBundle>/Resources/config/massive-search/Product.xml:

<!-- /path/to/YourBundle/Resources/config/massive-search/Product.xml -->
<massive-search-mapping xmlns="http://massive.io/schema/dic/massive-search-mapping">

    <mapping class="Acme\YourBundle\Entity\Product">
        <index name="product" />
        <id property="id" />
        <title property="name" />
        <url expr="'/path/to/' ~ object.id" />
        <description property="description" />
        <image expr="'/assets/images/' ~ object.id" />
        <fields>
            <field name="name" type="string" />
            <field name="description" type="string" />
        </fields>
    </mapping>

</massive-search-mapping>

Now, when you persist your Product with Doctrine ORM it should be automatically indexed by the configured search adapter.

Mapping

The MassiveSearchBundle requires that you define which objects should be indexed through mapping. Currently only XML mapping is natively supported:

<!-- /path/to/YourBundle/Resources/config/massive-search/Product.xml -->
<massive-search-mapping xmlns="http://massiveart.com/schema/dic/massive-search-mapping">

    <mapping class="Massive\Bundle\SearchBundle\Tests\Resources\TestBundle\Entity\Product">
        <index name="product" />
        <id property="id" />
        <fields>
            <field name="title" type="string" />
            <field name="body" type="string" />
        </fields>
    </mapping>

</massive-search-mapping>

This mapping causes the fields title and body to be indexed into an index named product, with the ID taken from the object's id property.

Mapping elements

The possible mappings are:

  • index: Name of the index in which to insert the record
  • id: The unique ID of the record (mandatory)
  • locale: The locale of the object (see Localization)
  • title: Title to use in search results
  • description: A description for the search result
  • url: The URL to which the search result should link
  • image: An image to associate with the search result
  • category: Name of a category string to include in search results
  • fields: List of <field /> elements detailing which fields should be indexed (i.e. used when finding search results).

Each mapping (except index and category which are literal values) can use either a property attribute or an expr attribute. These attributes determine how the value is retrieved. property will use the Symfony PropertyAccess component, and expr will use ExpressionLanguage.

PropertyAccess allows you to access properties of an object by path, e.g. title, or parent.title. The expression allows you to build expressions which can be evaluated, e.g. '/this/is/' ~ object.id ~ '/a/path'.

Fields

Fields dictate which fields are indexed in the underlying search engine.

Mapping:

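  • name: Field name
  • type: Field type (see Types below)
  • property: Object property from which to read the field value
  • expr: Expression from which to compute the field value; mutually exclusive with property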

Types:

  • string: Store as a string
  • complex: Apply mapping to an array of values

Complex mapping

Complex mapping provides a way to map a nested data structure within the subject object.

Note

This feature is not currently supported by the XML driver and is therefore not available unless used in a custom driver.

Expression Language

The MassiveSearchBundle includes its own flavor of the Symfony expression language.

<massive-search-mapping xmlns="http://massiveart.com/schema/dic/massive-search-mapping">
    <mapping class="Massive\Bundle\SearchBundle\Tests\Resources\TestBundle\Entity\Product">
        <!-- ... -->
        <url expr="'/path/to/' ~ object.title" />
        <!-- ... -->
    </mapping>
</massive-search-mapping>

Functions:

  • join: Maps to the implode function in PHP. e.g. join(',', ["one", "two"]) equals "one,two"
  • map: Maps to the array_map function in PHP. e.g. map([1, 2, 3], 'el + 1') equals array(2, 3, 4).
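
As a rough illustration of what these two functions compute, the following plain-PHP sketch reproduces the examples above with implode and array_map directly. It does not use the bundle's expression language; the closure standing in for the 'el + 1' expression is an assumption about how each element is exposed to the expression.

```php
<?php

// join(',', ["one", "two"]) corresponds to PHP's implode().
$joined = implode(',', ['one', 'two']);

// map([1, 2, 3], 'el + 1') corresponds to array_map(). The closure below
// stands in for the expression 'el + 1' applied to each element.
$mapped = array_map(function ($el) {
    return $el + 1;
}, [1, 2, 3]);

// $joined is now the string "one,two"
// $mapped is now the array [2, 3, 4]
```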

Localization

You can add a locale mapping which will cause the object to be stored in a localized index (if configured, see Localization).

<!-- /path/to/YourBundle/Resources/config/massive-search/Product.xml -->
<massive-search-mapping xmlns="http://massiveart.com/schema/dic/massive-search-mapping">

    <mapping class="Massive\Bundle\SearchBundle\Tests\Resources\TestBundle\Entity\Product">
        <!-- ... -->
        <locale property="locale" />
        <!-- ... -->
    </mapping>

</massive-search-mapping>

This assumes that the object has a property $locale which contains the object's current localization code.

If you do not map the locale, or the locale is resolved as NULL, then it will be assumed that the object is not localized.

Category

The category mapping adds a literal category name to the search result:

<!-- /path/to/YourBundle/Resources/config/massive-search/Product.xml -->
<massive-search-mapping xmlns="http://massiveart.com/schema/dic/massive-search-mapping">

    <mapping class="Massive\Bundle\SearchBundle\Tests\Resources\TestBundle\Entity\Product">
        <!-- ... -->
        <category name="Massive Products" />
        <!-- ... -->
    </mapping>

</massive-search-mapping>

Full example

The following example uses all the mapping options:

<!-- /path/to/YourBundle/Resources/config/massive-search/Product.xml -->
<massive-search-mapping xmlns="http://massiveart.com/schema/dic/massive-search-mapping">

    <mapping class="Massive\Bundle\SearchBundle\Tests\Resources\TestBundle\Entity\Product">
        <index name="product" />
        <id property="id" />
        <locale property="locale" />
        <title property="title" />
        <url expr="'/path/to/' ~ object.id" />
        <description property="body" />
        <image expr="'/assets/images/' ~ object.type" />
        <category name="My Category" />
        <fields>
            <field name="title" type="string" />
            <field name="body" type="string" />
        </fields>

    </mapping>

</massive-search-mapping>

Note:

  • This file MUST be located in YourBundle/Resources/config/massive-search
  • It must be named after the name of your class (without the namespace) e.g. Product.xml
  • Your Product class MUST be located in one of the following folders:
      • YourBundle/Document
      • YourBundle/Entity
      • YourBundle/Model
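
The naming rules above can be sketched as a small helper that derives the expected mapping file path from a class name. This is only an illustration of the convention: mappingFileFor and the /path/to/YourBundle directory are hypothetical and are not part of the bundle.

```php
<?php

/**
 * Hypothetical helper (not part of the bundle): returns the path at which
 * the mapping file for the given class is expected, following the
 * convention "class name without namespace" + ".xml".
 */
function mappingFileFor($bundleDir, $className)
{
    // Strip the namespace, keeping only the short class name.
    $pos = strrpos($className, '\\');
    $shortName = $pos === false ? $className : substr($className, $pos + 1);

    return rtrim($bundleDir, '/') . '/Resources/config/massive-search/' . $shortName . '.xml';
}

$path = mappingFileFor('/path/to/YourBundle', 'Acme\YourBundle\Entity\Product');
// $path is "/path/to/YourBundle/Resources/config/massive-search/Product.xml"
```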

Note

It will be possible in the future to specify paths for mapping files.

Note

The bundle automatically removes existing documents with the same ID. The ID mapping is mandatory.

Searching and the Search Manager

The search manager service is the only part of the system which you need to talk to. It provides all the methods you need to manage searching and the search index.

The search manager can be retrieved from the DI container using massive_search.search_manager.

For example:

<?php
$searchManager = $container->get('massive_search.search_manager');

Searching

Currently, only searching by query string is supported. The query string is passed directly to the search library:

<?php
$hits = $searchManager->createSearch('My Product')->execute();

foreach ($hits as $hit) {
    echo $hit->getScore();

    /** @var Massive\Bundle\SearchBundle\Search\Document $document */
    $document = $hit->getDocument();

    // retrieve the indexed document's "body" field
    $body = $document->getField('body');

    // retrieve the indexed ID of the document
    $id = $document->getId();
}

You can also search in a specific locale and a specific index:

<?php
$hits = $searchManager
  ->createSearch('My Article')
  ->index('article')
  ->locale('fr')
  ->execute();

Or search in multiple indexes:

<?php
$hits = $searchManager
  ->createSearch('My Article')
  ->indexes(array('article', 'product'))
  ->execute();

Search results are returned as “hits”, and each hit contains a “document”. The data may look something like the following when represented as JSON:

{
    "id": "2347",
    "document": {
        "id": "2347",
        "title": "My Article",
        "description": "",
        "class": "Acme\\Bundle\\ArticleBundle\\Entity\\Article",
        "url": "\/admin\/articles\/edit/2347",
        "image_url": "",
        "locale": null
    },
    "score": 0.39123123123123
}
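
If you receive a hit in this JSON form, for example from the Web API, it can be read with PHP's json_decode. The sketch below uses a shortened copy of the data above and plain arrays; it is an illustration for a consumer of the JSON and does not involve the bundle's own Hit or Document classes.

```php
<?php

// A shortened copy of the hit shown above.
$json = <<<'JSON'
{
    "id": "2347",
    "document": {
        "id": "2347",
        "title": "My Article",
        "url": "\/admin\/articles\/edit\/2347",
        "locale": null
    },
    "score": 0.39123123123123
}
JSON;

// Decode into associative arrays rather than stdClass objects.
$hit = json_decode($json, true);

$title = $hit['document']['title'];   // "My Article"
$url = $hit['document']['url'];       // "/admin/articles/edit/2347" (the \/ escapes are decoded)
$locale = $hit['document']['locale']; // null, i.e. the document is not localized
```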

Indexing and deindexing

After you have mapped your object (see Mapping) you can index it:

<?php
// $object is an instance of one of your mapped classes
$searchManager->index($object);

And deindex it:

<?php
// $object is an instance of one of your mapped classes
$searchManager->deindex($object);

Flushing

Flushing will tell the search adapter to process all of its pending tasks (for example, indexing, deindexing) now. This is sometimes useful when you need to ensure that data in the search index is in a certain state before performing more processing (for example when testing).

<?php
// process all pending index and deindex operations now
$searchManager->flush();

Note that flushing is not required, and that it is better not to flush if you can avoid it.

Search Adapters

Zend Lucene

The Zend Lucene search is the default implementation. It is written in pure PHP and requires no external search server. Be aware, however, that the version used here is unmaintained and is not considered stable.

To enable it, add the following dependencies to your composer.json:

"require": {
    ...
    "zendframework/zend-stdlib": "2.3.1 as 2.0.0rc5",
    "zendframework/zendsearch": "2.*"
}

and select the adapter in your application configuration:

# app/config/config.yml
massive_search:
    adapter: zend_lucene

The search data is stored on the filesystem. By default it will be placed in app/data. This can be changed as follows:

# app/config/config.yml
massive_search:
    # ...
    adapters:
        zend_lucene:
            basepath: /path/to/data

Note

The Zend Lucene library was originally written for Zend Framework 1 (ZF1). It was later ported to Zend Framework 2 (ZF2) and made available through composer.

Neither the ZF1 nor the ZF2 version is maintained. The ZF1 version is more up to date than the ZF2 version, which this library uses, and neither is compatible with the Apache Lucene index format.

Long story short: the library is not maintained, but we have encountered no issues with it and it is the only native PHP search library.

Elasticsearch

The Elasticsearch adapter allows you to use the Elasticsearch search engine.

You will need to include the official client in composer.json:

"require": {
    ...
    "elasticsearch/elasticsearch": "~1.3"
}

and select the adapter in your application configuration:

# app/config/config.yml
massive_search:
    adapter: elastic

By default, the adapter assumes the server is running on localhost:9200. You can change this, or configure more servers, as follows:

# app/config/config.yml
massive_search:
    # ...
    adapters:
        elastic:
            hosts: [ 192.168.0.63:9200, 192.168.0.64:9200 ]

Localization

The MassiveSearchBundle allows you to localize indexing and search operations.

To take advantage of this feature you need to choose a localization strategy:

# app/config/config.yml
massive_search:
    localization_strategy: index

The localization strategy decides how the documents are localized in the search implementation.

By default the strategy is the so-called noop, which does nothing, so localization is effectively disabled.

Strategies

There are currently two localization strategies:

  • noop: No operation, this strategy does nothing, or in other words, it disables localization.
  • index: Creates an index per locale. For example if you store a document in an index named “foobar” with a locale of “fr” then the backend will use an index named “foobar_fr”.
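
The naming rule of the index strategy can be sketched as follows. localizedIndexName is a hypothetical helper written for illustration, not the bundle's implementation. Leaving the name unchanged when no locale is given is an assumption based on the statement that objects without a locale are treated as not localized.

```php
<?php

/**
 * Hypothetical helper mirroring the naming rule of the "index" strategy:
 * append "_<locale>" to the index name. Returning the name unchanged for
 * a missing locale is an assumption, not confirmed bundle behavior.
 */
function localizedIndexName($indexName, $locale = null)
{
    if ($locale === null || $locale === '') {
        return $indexName;
    }

    return $indexName . '_' . $locale;
}

$french = localizedIndexName('foobar', 'fr'); // "foobar_fr"
$plain = localizedIndexName('foobar');        // "foobar"
```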

Mapping

See Mapping

Web API

The MassiveSearchBundle includes a simple controller which will return a JSON response for search queries.

Configuration

Simply include the routing file from your main application:

massive_search:
    resource: "@MassiveSearchBundle/Resources/config/routing.yml"
    prefix: /admin

Querying

You can then issue queries and receive JSON responses:

# GET /admin/search?q=Dan
[
    {
        "id": "2347",
        "document": {
            "id": "2347",
            "title": "Dan",
            "description": "",
            "class": "Acme\\Bundle\\ContactBundle\\Entity\\Contact",
            "url": "\/admin\/#contacts\/edit:2347",
            "image_url": "",
            "locale": null
        },
        "score": 0.30685281944005
    }
]

In specific indexes:

# GET /admin/search?q=Dan&index[0]=contact
# GET /admin/search?q=Dan&index[0]=contact&index[1]=product

or in a specific locale:

# GET /admin/search?q=Dan&locale=fr
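
A client can assemble these query strings with PHP's http_build_query, which produces the index[0]=...&index[1]=... form shown above. This is a minimal sketch: the /admin/search path follows from the routing prefix in the example, and combining index and locale in one request is an assumption, since the examples above only show them separately.

```php
<?php

// Path taken from the examples above ("/admin" prefix plus "/search").
$endpoint = '/admin/search';

$params = [
    'q'      => 'Dan',
    'index'  => ['contact', 'product'],
    'locale' => 'fr', // assumption: index and locale can be combined
];

// http_build_query() percent-encodes the square brackets.
$url = $endpoint . '?' . http_build_query($params);

// urldecode() is used here only to make the result readable.
$readable = urldecode($url);
// $readable is "/admin/search?q=Dan&index[0]=contact&index[1]=product&locale=fr"
```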

Extending

You can extend the bundle by customizing the Factory class and with custom metadata drivers.

Factory

The factory service can be customized, enabling you to instantiate your own classes for use in any listeners which you register. For example, suppose you want to add a “thumbnail” field to the Document object and have created a custom document class MyCustomDocument for that purpose:

<?php

namespace My\Namespace;

use Massive\Bundle\SearchBundle\Search\Factory as BaseFactory;

class MyFactory extends BaseFactory
{
    public function makeDocument()
    {
        return new MyCustomDocument();
    }
}

You must then register your factory as a service and register the ID of that service in your main application configuration:

massive_search:
    services:
        factory: my.factory.service

Metadata Drivers

Implement the Metadata\Driver\DriverInterface and add the massive_search.metadata.driver tag to your implementation's service definition.

<service id="massive_search.metadata.driver.xml" class="%massive_search.metadata.driver.xml.class%">
    <argument type="service" id="massive_search.metadata.file_locator" />
    <tag name="massive_search.metadata.driver" />
</service>

This is non-trivial and you should use the existing XML implementation as a guide.

Events

The MassiveSearchBundle issues events which can be listened to by using the standard Symfony event dispatcher. You can register a listener in your dependency injection configuration as follows:

<!-- rebuild structure index on massive:search:index:rebuild -->
<service id="acme.event_listener.search"
class="Acme\Search\SearchListener">
    <tag name="kernel.event_listener" event="<event_name>" method="methodToCall" />
</service>

massive_search.hit

The SearchManager will fire an event of type HitEvent in the Symfony EventDispatcher named massive_search.hit.

The HitEvent contains the hit object and the reflection class of the object which was originally indexed.

For example:

<?php

namespace Sulu\Bundle\SearchBundle\EventListener;

use Massive\Bundle\SearchBundle\Search\Event\HitEvent;

class HitListener
{
    public function onHit(HitEvent $event)
    {
        $reflection = $event->getDocumentReflection();
        if (false === $reflection->isSubclassOf('MyClass')) {
            return;
        }

        $document = $event->getDocument();
        $document->setUrl('Foo' . $document->getUrl());
    }
}

massive_search.pre_index

Fired before a document is indexed. See the code for more information.

Commands

The MassiveSearchBundle provides some commands.

massive:search:query

Perform a query from the command line:

$ php app/console massive:search:query "Foobar" --index="barfoo"
+------------------+--------------------------------------+-----------+-------------+-----------+------------------------+
| Score            | ID                                   | Title     | Description | Url       | Class                  |
+------------------+--------------------------------------+-----------+-------------+-----------+------------------------+
| 0.53148467371593 | ac984681-ca92-4650-a9a6-17bc236f1830 | Structure |             | structure | OverviewStructureCache |
+------------------+--------------------------------------+-----------+-------------+-----------+------------------------+

massive:search:status

Display status information for the current search implementation:

$ php app/console massive:search:status
+-------------+--------------------------------------------------------------+
| Field       | Value                                                        |
+-------------+--------------------------------------------------------------+
| Adapter     | Massive\Bundle\SearchBundle\Search\Adapter\ZendLuceneAdapter |
| idx:product | {"size":11825,"nb_files":36,"nb_documents":10}               |
+-------------+--------------------------------------------------------------+

massive:search:index:rebuild

Rebuild the search index. This command issues an event which instructs any listeners to rebuild all of the mapped classes.

$ php app/console massive:search:index:rebuild
Rebuilding: Acme\Bundle\ContactBundle\Entity\Contact [OK] 1 entities indexed
Rebuilding: Acme\Bundle\ContactBundle\Entity\Account [OK] 0 entities indexed

Options:

  • purge: Purge each affected index before reindexing.
  • filter: Only apply rebuild to classes matching the given regex pattern, e.g. .*Contact$.
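
To see which classes a filter pattern such as .*Contact$ would select, you can try it against class names with preg_match. This sketch only illustrates the regular expression itself. How the command applies the pattern internally, including the / delimiters used here, is an assumption.

```php
<?php

// Class names taken from the rebuild output above.
$classes = [
    'Acme\Bundle\ContactBundle\Entity\Contact',
    'Acme\Bundle\ContactBundle\Entity\Account',
];

// The pattern from the example. The "/" delimiters are an assumption;
// PHP's preg functions require some delimiter around the pattern.
$pattern = '/.*Contact$/';

// Keep only the class names that match the pattern.
$matched = array_values(array_filter($classes, function ($class) use ($pattern) {
    return preg_match($pattern, $class) === 1;
}));

// $matched contains only the Contact entity; Account does not end in "Contact".
```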