Drupalcon Denver...

Submitted by fago on Wed, 04/04/2012 - 16:00
At DrupalCon Denver I gave a talk called "Drupalize your data: use entities!". Check out the recording of my talk and find the slides attached. I was also one of the instructors at the "Rules Mastery" training, together with Johan Falk, Dick Olsson and Klaus Purer. The training materials are all available online, so check them out at http://tinyurl.com/rulesmastery! To get into Rules, check out The tiny book of Rules - kudos to Johan Falk again :-)

Drupal 8: Entities and data integration

Submitted by fago on Fri, 03/11/2011 - 19:41
As a follow-up to my previous blog post Drupal 8: Approaching Content and Configuration Management, I'm going to briefly cover how the Entity API could help us with two more of Dries' points: Web services and information integration. First off, for getting RESTful web services into core, having a unified API to build upon makes lots of sense. That way we make sure that the uniform CRUD interface we have available locally is the same one we expose to the web. Moreover, the possibility of having remote entities can help us a lot with integrating with remote systems. In a way, we'll get that anyway once we implement pluggable entity storage controllers (and you can even do so already in D7). But for that to be really useful, we need to know the data we want to work with. This is why I came up with hook_entity_property_info() in the Entity API module for D7. While for D7 it is built on top of the stuff that is there anyway, I think it should play a much more central role in D8 for various reasons:
  • A description of all the data properties of an entity enables modules to deal with any entity regardless of the entity type, just based on the available data properties (and their data types). That way, modules can seamlessly continue to work even with entities stemming from remote systems. This is how the RestWS, SearchAPI and Rules modules already work in D7.
  • With pluggable storage backends, I see no point in SQL-centric schema information unless we are going to use SQL-based storage. By defining the property info, storage backends can work based on that instead, e.g. the SQL backend can generate the schema out of the property information.
  • When working with entities, what matters is the data actually available in the object. To a module, the internal storage format doesn't matter. In a way, the property information defines the "contract" of what the entity has to look like. Given that, all APIs should be built on it, i.e. EFQ (EntityFieldQuery) should accept the value to query for exactly the way it appears in the entity and not in a storage-backend dependent way (which no one can predict once storage is pluggable). In the very same way, display formatters and form widgets could just rely on the described data too.
As we discussed in the entity API BoF, it might make sense to build it around "field types" evolving to "data types" - thus being decoupled from being a "field". In the very same way we can start building the display components around the entity properties, so they are not necessarily based on fields (bye bye field API "extra fields"). Whatever the implementation will look like, let's let the entity API become our long missing data API!
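
To make the second point above more concrete, here is a minimal, language-neutral sketch (written in Python, not Drupal code) of a storage backend deriving its SQL columns from a property description. The property names, the type names and the type-to-column mapping are all assumptions made up for this illustration; they are not the actual hook_entity_property_info() format.

```python
# Hypothetical property description for a node-like entity type.
# Keys and type names are illustrative only.
PROPERTY_INFO = {
    "title": {"type": "text", "label": "Title"},
    "created": {"type": "date", "label": "Date created"},
    "author": {"type": "user", "label": "Author"},  # reference to a user entity
}

# Assumed mapping from abstract data types to SQL column types.
SQL_TYPES = {"text": "VARCHAR(255)", "date": "INTEGER", "integer": "INTEGER"}

# Type names that denote a reference to another entity (assumption).
ENTITY_TYPES = {"node", "user"}


def sql_columns(property_info):
    """Derive column names and SQL types from the property description."""
    columns = {}
    for name, info in property_info.items():
        if info["type"] in ENTITY_TYPES:
            # A reference is stored as the numeric id of the referenced entity.
            columns[name + "_id"] = "INTEGER"
        else:
            columns[name] = SQL_TYPES[info["type"]]
    return columns
```

A non-SQL backend could read the same description and ignore SQL_TYPES entirely, which is the point of making the description, rather than the schema, the shared contract.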

Drupal 8: Approaching Content and Configuration Management

Submitted by fago on Wed, 03/09/2011 - 19:02
As a follow-up to heyrocker's great core conversations talk I'd like to share my thoughts on that topic. So is X content? Or configuration? As heyrocker pointed out: We shouldn't have to care. So yes, the line between content and configuration is blurry - stuff that is content on one site would better fit the configuration bill on another site. Still, this doesn't mean we have to treat content and configuration exactly the same way. We can and we need to build separate systems for configuration management and content deployment, but the point is: We need one and the same type of objects to be able to behave as content or configuration depending on the actual requirements of the site. Not necessarily out of the box, but it should be doable such that people can implement their use cases. To facilitate that, we need a foundational unified API, an API that deals with objects the same way regardless of whether they are configuration or content, an API that allows us to fetch any object, to export it, and to import and save it on another site. As said in the core conversation, I think the entity API perfectly fits into that. That way, an entity would basically be any data object which is integrated into Drupal, such that we have a unified CRUD API - and a unified way to import and export that data. It should be able to deal with machine names or auto-incremented numeric ids, but we might also want it to be able to deal with UUIDs out of the box. So we'd finally get a unified data API and exportability, the very foundation for solving the configuration management and content deployment problems. But for that to take off, we need to keep the entity concept slim and not bake too many assumptions into it. Is it content or configuration? Is it user-facing? Is it fieldable or viewable/renderable? Well maybe, maybe not. So while it makes a lot of sense to build more APIs around entities, we should never actually require them. Instead, we could just provide the APIs such that any entity type is able to opt in, if it fits the bill.
In addition to that, I'd like to share my thoughts on how the Entity API could help us to cover two more points Dries mentioned in his keynote: Web services and information integration. I'll come back to that in a follow-up post.

Restful web services in Drupal 7

Submitted by fago on Mon, 01/31/2011 - 14:07

During the work on my thesis over the last year, I played around a lot with RESTful services based upon the Entity API. What I needed was a simple service that just exposes Drupal's entities in a RESTful manner, while obeying Drupal's permission and access systems. Now, klausi and I have created a small module that does exactly that: Restful web services.

So how does it work?

The module makes use of the Entity API and the information about entity properties (provided via hook_entity_property_info()) to provide resource representations for all entity types (nodes, comments, users, taxonomy terms, ...). It aims to be fully compliant with the REST principles. Drupal's entities are exposed at the unified $entity_type/$id paths, while respecting the Accept and Content-Type headers of the HTTP requests. That means if a client requests node/1 with the usual HTTP Accept headers it will get Drupal's usual output; if it requests node/1 while accepting only JSON, it will get the JSON representation of the node. Similarly, all CRUD operations are supported as is common for RESTful services. In addition, the module supports GET requests on paths like node/1.json, node/1.xml or node/1.rdf too.
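
The format selection described above can be sketched roughly as follows. This is a simplified Python illustration, not the module's code: the function name, the MIME type table and the decision to ignore q-values in the Accept header are all assumptions made for brevity.

```python
# Assumed mapping of MIME types to format names (illustrative).
FORMATS = {
    "application/json": "json",
    "application/xml": "xml",
    "application/rdf+xml": "rdf",
}
EXTENSIONS = {"json", "xml", "rdf"}


def pick_format(path, accept="*/*"):
    """Return (resource_path, format) for a request path and Accept header."""
    # An explicit extension such as node/1.json wins over the Accept header.
    resource, dot, ext = path.rpartition(".")
    if dot and ext in EXTENSIONS:
        return resource, ext
    # Otherwise take the first known type in the order listed.
    # Simplification: q-values are ignored here.
    for part in accept.split(","):
        mime = part.split(";")[0].strip()
        if mime == "text/html":
            return path, "html"
        if mime in FORMATS:
            return path, FORMATS[mime]
    # Unknown or wildcard types fall back to the regular page output.
    return path, "html"
```

With this sketch, a browser-style Accept header that starts with text/html yields the regular page, while a client sending only application/json gets the JSON format for the same path.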

And authentication...?

As mentioned above, the solution just obeys Drupal's permission and access system. If there is an active session and the user has sufficient permission for the request, it will be served. So any add-on authentication strategies would have to plug into Drupal's usual user system. For example, the RestWS module comes with a small add-on module that authenticates users via HTTP basic authentication. So you can define a regular user for a client, configure their access permissions as usual, and just pass its credentials with a request.
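
For readers unfamiliar with HTTP basic authentication, the following Python sketch shows how a client would encode a user's credentials into the Authorization header and how a server could decode them again. It only illustrates the standard header format; how the add-on module processes it internally is not covered here, and the function names are made up for this example.

```python
import base64


def basic_auth_header(username, password):
    """Build the Authorization header a client sends with each request."""
    raw = f"{username}:{password}".encode("utf-8")
    return {"Authorization": "Basic " + base64.b64encode(raw).decode("ascii")}


def parse_basic_auth(header_value):
    """Decode an Authorization header value back into (username, password)."""
    scheme, _, token = header_value.partition(" ")
    if scheme.lower() != "basic":
        raise ValueError("not a Basic authorization header")
    decoded = base64.b64decode(token).decode("utf-8")
    # Only the first colon separates user name and password.
    username, _, password = decoded.partition(":")
    return username, password
```

Since the credentials are only encoded and not encrypted, this scheme is usually combined with HTTPS.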

So what about the property information?

The module makes use of the property information the entity API collects for all entity types, as well as the accompanying wrapper classes. While the API also allows providing non-entities as resources, it requires the existence of property information. Representations of entities are provided according to their property information. What does that mean?
So let's have a look at an example: The node author. In the property information about nodes, there is no uid property; instead there is an 'author' property, pointing to the corresponding user entity. So the module makes use of that information to output a proper reference to the author, being the author's URI (URIs are the proper way to do references in RESTful designs). So instead of just outputting the user id as a uid property with an integer value, we output a proper reference to the node's author. Apart from that, the property information includes access permissions - so updating the node author will only be possible if you have sufficient permissions.
Furthermore, the property information could be used to provide a description of the web service for the caller, in a human-readable as well as in a machine-readable way.
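
The author example can be sketched like this. It is a Python illustration under stated assumptions, not the module's implementation or its actual output format: the "source" key, the base URL and the function name are hypothetical.

```python
BASE_URL = "http://example.com"  # hypothetical site URL

# Type names that denote a reference to another entity (assumption).
ENTITY_TYPES = {"node", "user"}

# Hypothetical property description: 'author' is exposed instead of 'uid'.
NODE_PROPERTY_INFO = {
    "title": {"type": "text", "source": "title"},
    "author": {"type": "user", "source": "uid"},
}


def represent(raw, property_info, base_url=BASE_URL):
    """Build a representation in which references become URIs."""
    out = {}
    for name, info in property_info.items():
        value = raw[info["source"]]
        if info["type"] in ENTITY_TYPES:
            # Output a reference to the entity instead of its bare id.
            out[name] = f"{base_url}/{info['type']}/{value}"
        else:
            out[name] = value
    return out
```

A raw node with uid 7 would then be represented with an author value of http://example.com/user/7, and no uid key appears in the output.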

Which formats are supported?

The module currently comes with support for JSON, XML and RDF/XML, and modules may add more formatters. As the property information is available to the formatters too, it's possible to write formatters that output some properties in a certain way, e.g. using a special XML namespace. Similarly, the RDF formatter looks up the RDF mapping defined for a property in order to generate meaningful RDF output.

What's different to the Services module?

The main differences are:

* RestWS provides only RESTful services (no message-oriented or RPC-style web services like SOAP, XML-RPC etc.).
* RestWS strongly builds upon the Entity API and its property information, and thus utilizes it for CRUD, access checks, getting property information, etc.
* Property information is built into the API, so formatters may make use of it to format the data in a sensible way.
* There are no "service endpoints" to configure, as resources are just available at uniform paths like node/1, user/1. We do not see a need to have multiple endpoints for the same resource in a RESTful design.

For more about the relation and partial overlap to the Services module, read and participate in the discussion over at http://drupal.org/node/1042512.

Thinking Drupal 8 and beyond.

Submitted by fago on Tue, 01/11/2011 - 09:39
I'd like to share some of my thoughts and long-term visions for Drupal 8 and beyond:

1. Full CRUD for the Entity API

In the long term I want to see the Entity API become our main CRUD API, which modules may build upon. For D8 I think every entity should be based upon a class implementing the "EntityInterface", which provides some simple methods to easily access identifiers, revision ids, labels and URIs, as well as save(), delete(), ... methods.

2. Improved DX for fields

Now, we have two kinds of entity properties: Fields and non-fields. So should one use an entity property or a field? We have some nice APIs around fields, but they are not built for daily developer usage, so programmatically re-using fields is no fun. Still, developers can go without a field for any custom data storage, but then we are losing all the advantages fields come with - like flexible storage or the awesome module support (which I've tried to solve in D7 via hook_entity_property_info()). Once we have improved DX for fields in place, developers can easily embrace them and benefit from their advantages.

3. Everything is a field

So why not adopt fields for any entity property? Then we could make entity properties easily translatable via the field API and benefit from the improved out-of-the-box module integration and stuff already written for fields, like widgets and formatters. Of course, some fields would need to be hidden from the UI then.

4. Storage APIs

With everything being a field, entity data would be scattered around in lots of db tables. Also, it should be possible to use the API to register any remote data object as entity. So we need to have entity-storage and field-storage backends, such that also the remote-data-entity can have fields stored in the local database. Thus, with everything being a field we need to allow developers to delegate field-storage to the entity.

5. Describing data

Also, as of now field types have to describe the db schema to be used for saving. However, the schema API is built for the database system so it has no notion of describing stuff beyond it, like that a timestamp is a date. So maybe the contract between the storage API and the field system should not be the db schema, but an actual description of the data to be saved. I.e. instead of telling the system to save an integer which will be used for saving a node id, tell it that it has to save a reference to a node. Apart from that, the described data structure is what other APIs built around fields (widgets, formatters) have to use (or should use -> query), just as any developer working with fields. So simultaneously, modules making use of entities and their fields could rely on that information, e.g. to determine all entity references or just to get some data values of a certain type, e.g. textual values for token replacements.
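
To illustrate what other modules gain from such a description, here is a small Python sketch. The field names, the type names and both helper functions are assumptions made for this example, not an existing API.

```python
# Hypothetical data descriptions: they state what a value is,
# not how it is stored.
FIELD_DESCRIPTIONS = {
    "title": {"type": "text"},
    "body": {"type": "text"},
    "created": {"type": "date"},  # stored as an integer, described as a date
    "related": {"type": "node"},  # a reference to a node, not a bare integer
}

# Type names that denote a reference to another entity (assumption).
ENTITY_TYPES = {"node", "user"}


def references(descriptions):
    """Map each entity reference field to the entity type it points to."""
    return {
        name: d["type"]
        for name, d in descriptions.items()
        if d["type"] in ENTITY_TYPES
    }


def values_of_type(entity, descriptions, data_type):
    """Collect the values of all fields of a given data type."""
    return {
        name: entity[name]
        for name, d in descriptions.items()
        if d["type"] == data_type and name in entity
    }
```

Calling values_of_type(entity, FIELD_DESCRIPTIONS, "text") would, for instance, give a token system all textual values without it knowing anything about the storage schema.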

6. Profile2 in core

With 1) in place, it should be rather easy to replace our old profile module with something new built upon entities and fields like profile2. We'll see how profile2 does for d7.

7. Rules in core

I'd really like to work on bringing a slightly simplified version of the foundational API of Rules into core, thus replacing the current action system. However, with Rules 2.x the whole API is built around the way of describing data utilized for hook_entity_property_info(), as well as on entities. Thus, for Rules in core to make sense, it would need something comparable in core - e.g. point 5).

Metadata, what for? - Introducing Entity Metadata!

Submitted by fago on Wed, 08/04/2010 - 13:20
Update 10.01.2011: In the meantime the Entity metadata module got merged into the main "entity" API module.
Drupal 7 modules working with entities often face the same problems:
  • How to create/save/delete an entity?
  • How to get referenced entities?
  • Which properties are there and how can they be accessed or modified?
This is what Entity Metadata tries to solve for Drupal 7. It collects metadata from modules, such that it knows how these things can be done, and provides API functions for that purpose. There are API functions for full entity CRUD, for determining access, as well as data wrappers that simplify dealing with entity properties.

Metadata for data properties, why that?

You might think: we have fields. Yes we have, but not everything is a field. There are also entity properties, like the node title and author, the term hierarchy and lots of others. Entity metadata collects information about all those available properties - regardless of whether they are fields or not - and makes them accessible the same way. For that you have to provide property info via a hook, e.g. this is the info the module provides for books: