SitePoint PHP Blog:
Turning a Crawled Website into a Search Engine with PHP
Jul 06, 2015 @ 10:19:43

The SitePoint PHP blog has posted the second part of their "Powerful Custom Search Engines with Diffbot" series, showing how to take the Diffbot results and make them searchable.

In the previous part of this tutorial, we used Diffbot to set up a crawljob which would eventually harvest SitePoint’s content into a data collection, fully searchable by Diffbot’s Search API. We also demonstrated those searching capabilities by applying some common filters and listing the results. [...] In this part, we’ll build a GUI simple enough for the average Joe to use it, in order to have a relatively pretty, functional, and lightweight but detailed SitePoint search engine. What’s more, we won’t be using a framework, but a mere total of three libraries to build the entire application.

Those interested in the end result can skip straight to the demo. Otherwise, the tutorial walks you through the full process:

  • Bootstrapping the environment and needed libraries
  • Creating a simple "home" page with a Diffbot client
  • Creating the frontend interface (a form allowing for various search terms)
  • Writing the JavaScript to catch the form submission
  • Adding CSS to style the page
  • Building out the PHP backend to perform the different search types (author and keywords)

Finally, he ties it all together and creates the output of the search results, providing links to each of the matching pages along with the posting date, author information and a brief summary. He ends the post with a look at paginating the results via a "PaginationHelper" class that drops a navigation item at the bottom of the results and handles moving from page to page, interfacing with the Diffbot client.
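As a rough illustration of the pagination step described above, here is a minimal sketch of the arithmetic such a helper needs. It is not the tutorial's "PaginationHelper": the class name SimplePager, its methods and the "page" and "q" query parameters are all hypothetical, and the offset it computes would still have to be passed to whatever search client is in use.

```php
<?php
// Hypothetical pager, written for illustration only. It is NOT the
// PaginationHelper class from the SitePoint tutorial.
final class SimplePager
{
    private $totalHits;
    private $perPage;

    public function __construct(int $totalHits, int $perPage = 20)
    {
        $this->totalHits = max(0, $totalHits);
        $this->perPage = max(1, $perPage);
    }

    // Number of pages needed for all hits (always at least one).
    public function pageCount(): int
    {
        return max(1, (int) ceil($this->totalHits / $this->perPage));
    }

    // Force a requested page number into the valid range.
    public function clamp(int $page): int
    {
        return min(max(1, $page), $this->pageCount());
    }

    // Zero-based index of the first result on the given page.
    public function offset(int $page): int
    {
        return ($this->clamp($page) - 1) * $this->perPage;
    }

    // Build "prev"/"next" query strings, keeping the other search parameters.
    public function links(int $page, array $query): array
    {
        $page = $this->clamp($page);
        $links = [];
        if ($page > 1) {
            $links['prev'] = '?' . http_build_query(['page' => $page - 1] + $query);
        }
        if ($page < $this->pageCount()) {
            $links['next'] = '?' . http_build_query(['page' => $page + 1] + $query);
        }
        return $links;
    }
}
```

With 45 hits and 20 results per page, for example, `(new SimplePager(45, 20))->offset(3)` gives 40, and an out-of-range page number is clamped to the last page rather than producing an empty result list.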

tagged: search engine diffbot tutorial series part2 results crawled website

Link: http://www.sitepoint.com/turning-crawled-website-search-engine-php/

SitePoint PHP Blog:
Crawling and Searching Entire Domains with Diffbot
Jul 02, 2015 @ 09:41:39

The SitePoint PHP blog has a new tutorial posted, the first part in a new series, showing you how to create a "powerful custom search engine" with the help of the Diffbot service. In this first part they help you get everything you need set up (including a VM to run it from).

In this tutorial, I’ll show you how to build a custom SitePoint search engine that far outdoes anything WordPress could ever put out. We’ll be using Diffbot as a service to extract structured data from SitePoint automatically, and this matching API client to do both the searching and crawling. I’ll also be using my trusty Homestead Improved environment for a clean project, so I can experiment in a VM that’s dedicated to this project and this project alone.

He walks you through each step of the process, first creating the "crawljob" script and then executing it to gather the results. He also shows how to display this information via a simple GUI when searches are performed. A Diffbot PHP client library makes creating the crawljob simpler and lets you configure things like the maximum number of items to crawl, patterns to match and which URLs to follow on the pages. Running the script creates the job, which is then executed immediately. The same library makes searching the data simpler too, using a "search" method along with some special tagging, and returning a JSON result with the matching records.

tagged: crawl domain diffbot search engine part1 series tutorial

Link: http://www.sitepoint.com/crawling-searching-entire-domains-diffbot/

Jonathan Wage:
Using the Symfony Expression Language for a Reward Rules Engine
May 28, 2015 @ 10:07:27

Jonathan Wage has a new tutorial on his site showing you how to use the Symfony Expression Language to create simple logic statements. He illustrates with a project OpenSky applied it to: a "reward" rules engine.

We recently adopted the Symfony Expression Language in the rules engine at OpenSky. It has brought a new level of flexibility to our system and creating new logic has never been easier. [...] The expression language allows you to perform expressions that get evaluated with raw PHP code and return a single value. It can be any type of value and is not limited to boolean values.

He starts with a simple example, showing how it can return a boolean based on the results of an evaluation of an array of data. He then takes this to the next level and uses it with a Doctrine object, evaluating the results of methods to apply "rewards" to a user's account. He shows how to define the Doctrine objects with the necessary methods, how to write the rule, and how to build a lookup class that finds the rules that apply to the current situation.

tagged: symfony expression language rules engine tutorial doctrine object

Link: http://jwage.com/post/76799775984/using-the-symfony-expression-language-for-a-reward

Composer files being indexed by Google
Dec 10, 2014 @ 11:36:55

In an interesting thread on the /r/php subreddit on Reddit.com, a user noticed that Google is indexing Composer files that are in the document root of PHP applications. These files, like "composer.json" and "composer.lock", can provide detailed information about which packages and libraries are in use in the application (information disclosure).

The problem is that these files are placed in the web root of the application and not in a folder one level up, a recommended practice. The post links to a Google search that shows an example of current sites with the issue.

Another comment in the same post also reminds users not to have things like their ".git" files in the document root either, as they can provide valuable information about your application's code to would-be attackers. Direct access to these files can be blocked in the web server configuration, but it's far better to restructure the application so that they live in a parent directory of the actual web root.

tagged: composer files composerlock composerjson index google search engine security

Link: http://www.reddit.com/r/PHP/comments/2ourf7/composer_files_being_indexed_by_google/

PHP Town Hall:
Episode 30: Specs, Implementations, and New Engines OH MY!
Aug 26, 2014 @ 15:23:59

The PHP Town Hall podcast has posted their latest episode today, hosted by Phil Sturgeon and Ben Edmunds and featuring a few special guests: "Specs, Implementations, and New Engines OH MY!"

This week Ben and Phil are joined by core PHP developer extraordinaires Andrea Faulds and Levi Morrison. We discuss the new PHP engine spec, various RFCs, and all things internals. Also PHP 6 is officially dead, let’s have a moment of silence.

You can check out this latest episode through the in-page audio player, by downloading it, or over on YouTube.

tagged: phptownhall ep30 specs implementation engine podcast

Link: http://phptownhall.com/blog/2014/08/25/episode-30-specs-implementations-and-new-engines-oh-my/

SitePoint PHP Blog:
Using Solarium with SOLR for Search – Implementation
May 07, 2014 @ 10:54:10

The SitePoint PHP blog has posted the third part of their series looking at using the Solarium tool to hook your PHP application into a SOLR search instance. In this latest part of the series they get down to the actual search implementation.

In the first part I introduced the key concepts and we installed and set up SOLR. In part two we installed and configured Solarium, a library which enables us to use PHP to “talk” to SOLR as if it were a native component. Now we’re finally ready to start building the search mechanism, which is the subject of this installment.

He starts with a simple search example, making a request to select the matches for a given query (given on the URL as a variable "q"). He shows how to run the select and fetch the results as a result set. He enhances this, containing the search logic inside a class and making a template to show the results. He also includes examples of how to use the "Disjunction Max", sorting and pagination functionality. Finally, he looks at a more complex type of search, a faceted search, and includes code examples of making the request and displaying the results.

tagged: solr solarium search engine tutorial implement basics faceted

Link: http://www.sitepoint.com/using-solarium-solr-search-implementation/

SitePoint PHP Blog:
Using Solarium with SOLR for Search - Setup
May 02, 2014 @ 11:49:16

The SitePoint PHP blog has posted a tutorial showing you how to use the Solarium library to search SOLR. Solarium is a PHP-based, open source tool that helps make interfacing with a SOLR search instance much easier. This post is part one of a larger series covering the combination of SOLR and Solarium.

Apache’s SOLR is an enterprise-level search platform based on Apache Lucene. It provides a powerful full-text search along with advanced features such as faceted search, result highlighting and geospatial search. [...] If you’re using PHP then the Solarium Project makes integration even easier, providing a level of abstraction over the underlying requests which enables you to use SOLR as if it were a native implementation running within your application. In this series, I’m going to introduce both SOLR and Solarium side-by-side.

He starts with some of the basic concepts behind what SOLR is, what kinds of things it's useful for and how to get it installed on your system (using Homebrew). He shows how to set up a sample schema including a detailed look at the different types and required fields it will need. As this is just the first part of the series, it stops there and will get into the actual PHP code for the interface in the next edition.

tagged: solr solarium search engine tutorial interface opensource library

Link: http://www.sitepoint.com/using-solarium-solr-search-setup/

Anthony Ferrara:
An Opinion On The Future Of PHP
Mar 10, 2014 @ 09:41:40

In his latest post Anthony Ferrara shares some of his personal opinions about the future of PHP and how some of the pieces in play now might fit in.

There's been a lot of buzz in the community lately around PHP and its future. The vast majority of this buzz has been distinctly positive, which is awesome to hear. There's been a lot of talk about PHP6 and what that might look like. There's been a lot of questions around HHVM and its role in the future of the language and community. Well, let me share with you some of my thoughts in this space...

He covers a few different topics including backwards compatibility, suggestions of a complete engine rewrite, and making the SPL entirely OOP. He spends most of the post talking about HHVM (the HipHop VM), how it compares to "plain old PHP" and why it's not exactly "magic".

tagged: opinion future language hhvm hack engine backwards compatibility

Link: http://blog.ircmaxell.com/2014/03/an-opinion-on-future-of-php.html

Nikita Popov:
Fast request routing using regular expressions
Feb 19, 2014 @ 09:03:07

In his latest post Nikita Popov talks about routing and regular expressions. He also shares some work he's done to create a fast request router using them in "userland" code instead of a C extension.

Some time ago I stumbled on the Pux routing library, which claims to implement a request router that is many orders of magnitude faster than the existing solutions. In order to accomplish this, the library makes use of a PHP extension written in C. However, after a cursory look at the code I had the strong suspicion that the library was optimizing the wrong parts of the routing process. [...] To investigate the issue further I wrote a small routing library: FastRoute. This library implements the dispatch process that I will describe below.

He includes some benchmarks against the results from a C-based routing engine showing his solution performing slightly better. What he's really talking about, though, is the dispatch process in general, not just his implementation. He talks about "the routing problem" many engines face - having to loop through a potentially large set of routes to find a match. He offers an alternative using regular expressions and compiling all of the routes down into one large expression. He includes a simple implementation of the method and reruns the same benchmarks with some different results. He offers one potential solution for speeding it up using "chunked expressions" to break it down into more manageable matching. He includes benchmarks for this last solution as well, showing a slight improvement.
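The idea of compiling every route into one large expression can be sketched as follows. This is a minimal illustration rather than the code from the post or from FastRoute: the route table, handler names and the compileRoutes() and dispatch() functions are hypothetical. To tell which alternative matched, it relies on PCRE's (*MARK:name) verb, which preg_match() reports under the "MARK" key, and on a branch-reset group "(?|...)" so that capture numbering restarts in each alternative; the post itself may identify the matching route differently.

```php
<?php
// Hypothetical route table: regex fragment => handler name.
$routes = [
    '/user/(\d+)'        => 'showUser',
    '/post/([a-z0-9-]+)' => 'showPost',
    '/about'             => 'about',
];

// Combine all routes into a single regex. Each alternative ends with a
// (*MARK:n) verb so the matching route can be identified after one match.
function compileRoutes(array $routes): array
{
    $branches = [];
    $handlers = [];
    $i = 0;
    foreach ($routes as $pattern => $handler) {
        $branches[] = $pattern . '(*MARK:' . $i . ')';
        $handlers[$i] = $handler;
        $i++;
    }
    // (?|...) is a branch-reset group: every alternative numbers its
    // capture groups starting from 1.
    $regex = '~^(?|' . implode('|', $branches) . ')$~';

    return [$regex, $handlers];
}

// Run the single regex against a URI. Returns [handler, captured arguments]
// or null when no route matches.
function dispatch(string $regex, array $handlers, string $uri): ?array
{
    if (!preg_match($regex, $uri, $matches)) {
        return null;
    }
    $handler = $handlers[$matches['MARK']];
    unset($matches['MARK']);   // remove the marker entry
    array_shift($matches);     // remove the full-match entry

    return [$handler, $matches];
}

[$regex, $handlers] = compileRoutes($routes);
```

Calling `dispatch($regex, $handlers, '/user/42')` then returns the "showUser" handler together with the captured "42", after a single preg_match() call instead of one call per route. With a very large route table a single expression can get unwieldy, which is the kind of problem the "chunked expressions" mentioned above are meant to address.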

tagged: regularexpression routing dispatch engine chunk compile

Link: http://nikic.github.io/2014/02/18/Fast-request-routing-using-regular-expressions.html

Lately in PHP Podcast #43 - "Is Facebook HHVM going to Replace Zend Engine in PHP6"
Jan 20, 2014 @ 11:36:41

On the PHPClasses.org site today they've published the latest episode in their "Lately in PHP" podcast series, Episode #43 - "Is Facebook HHVM going to Replace Zend Engine in PHP 6".

The Facebook HipHop Virtual Machine, HHVM, has been evolving a lot, so PHP developers are considering it as a possible replacement for Zend Engine in PHP 6. This was one of the main topics discussed by Manuel Lemos and César Rodas in the episode 43 of the Lately in PHP podcast. They also discussed other topics like FastCGI support in HHVM, having PHP function naming consistency plans for PHP 6, TLS peer verification for secure connections, and using Composer to install JavaScript, CSS and images for PHP projects.

You can listen to this latest episode either through the in-page player, by downloading the mp3 or watching the live video recording from the Google Hangout.

tagged: hhvm zend engine php6 podcast latelyinphp episode

Link: http://www.phpclasses.org/blog/post/225-Is-Facebook-HHVM-going-to-Replace-Zend-Engine-in-PHP-6--Lately-in-PHP-podcast-episode-43.html