Solr 9 error while trying to push to Datastore. Missing required field: site_id #8423

Closed
dpuertolas opened this issue Sep 9, 2024 · 2 comments

@dpuertolas

CKAN version

CKAN version: 2.10.4
Solr version: I have tried both version 2.10-solr9 and a Solr 9.6.1 image with a modified schema.xml file.
Xloader version: 1.0.1

Describe the bug

I am trying to update the Solr version used in my CKAN installation (from Solr 6 to Solr 9) after upgrading CKAN from 2.9 to 2.10.
I reindex, and the datasets load correctly, but when attempting to upload resources to the Datastore, I receive the following error:

Error: Unable to add package to search index: Solr returned an error: Solr responded with an error (HTTP 400): [Reason: [doc=null] missing required field: site_id]

I have found this error in two unresolved issues that have been open for over a year:

They describe a similar behavior to mine.

Here is my list of plugins:

ckan.plugins = stats resource_proxy text_view image_view webpage_view datatables_view geo_view geojson_view wmts_view shp_view pdf_view datastore xloader harvest ckan_harvester csw_harvester gobierno_de_navarra_xlsx_harvest gobierno_de_navarra_json_harvest gobierno_de_navarra_csw_harvest base_templates scheming_datasets scheming_groups scheming_organizations fluent spatial_metadata spatial_query tableau_view datos_gob_es_federation gobierno_de_navarra activity

My portal is multilingual, so we use Fluent presets, although we are not using 'repeating_subfields' (as mentioned in some of the issues I referenced above).

I have checked the JSON that is sent to Solr before the error occurs, and it looks correct, containing the 'site_id' and the other fields.
When I tried modifying the Solr schema to make the 'site_id' field not required, I received the same error, but this time for the 'id' field.

@dpuertolas
Author

Solved:

Because of the multilingual setup, some fields in the dataset dict held values of type dict when uploading to the Datastore. To solve this, I added a function to my plugin that converts these values to JSON strings:

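import json  # at module level

def _before_index_dump_dicts(self, pkg_dict):
    # Dict values cannot be indexed as-is, so serialize each one to a JSON string.
    for key, value in pkg_dict.items():
        if isinstance(value, dict):
            # Reassigning an existing key is safe while iterating over items().
            pkg_dict[key] = json.dumps(value)
    return pkg_dict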

I call this function from the before_dataset_index method of the ckan.plugins.interfaces.IPackageController interface.
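
For readers who want to see the effect in isolation, here is a minimal standalone sketch of the same idea. It is not the author's plugin code: the function name `dump_dict_values`, the sample dataset dict, and the field name `title_translated` are all hypothetical, and this variant returns a new dict rather than modifying the one passed in.

```python
import json


def dump_dict_values(pkg_dict):
    """Return a copy of pkg_dict with every dict value serialized to JSON."""
    return {
        key: json.dumps(value) if isinstance(value, dict) else value
        for key, value in pkg_dict.items()
    }


# Hypothetical dataset dict with one multilingual (dict-valued) field.
sample = {
    "id": "abc-123",
    "site_id": "default",
    "title_translated": {"es": "Datos", "en": "Data"},
}

indexed = dump_dict_values(sample)
print(indexed["title_translated"])  # prints {"es": "Datos", "en": "Data"}
print(type(indexed["title_translated"]).__name__)  # prints str
```

In a plugin, a function like this would presumably be called at the end of before_dataset_index so that every dict-valued field reaches Solr as a plain string.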

@amercader
Member

Glad you found the issue.
For anyone finding this error in the future: make sure that before_dataset_index() (before_index() in older CKAN versions) returns a valid dataset dict in every plugin you have loaded, and that the values of the fields can be indexed by Solr.
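
One way to act on this advice while debugging is to scan the dict that before_dataset_index() is about to return and list the fields with unexpected value types. The sketch below is only an illustration: `find_suspect_fields` is a hypothetical helper, and the assumption that only scalars and flat lists of scalars are safe to index is a simplification, not a rule taken from CKAN or Solr.

```python
def find_suspect_fields(pkg_dict):
    """Map field name -> type name for values that are not scalars or flat lists."""
    # Assumption: these scalar types (and lists/tuples of them) index cleanly.
    scalar = (str, int, float, bool, type(None))
    suspects = {}
    for key, value in pkg_dict.items():
        items = value if isinstance(value, (list, tuple)) else [value]
        if not all(isinstance(item, scalar) for item in items):
            suspects[key] = type(value).__name__
    return suspects


# Hypothetical dataset dict: one clean list field, one dict-valued field.
example = {
    "id": "abc-123",
    "site_id": "default",
    "tags": ["open-data", "geo"],
    "notes_translated": {"en": "Some notes"},
}

print(find_suspect_fields(example))  # prints {'notes_translated': 'dict'}
```

Logging the result of such a check from each loaded plugin should narrow down which one is leaving dict values in the dataset dict.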

mjanez added a commit to mjanez/ckanext-schemingdcat that referenced this issue Sep 18, 2024
- Added `_before_index_dump_dicts` method to `PackageController` class to convert dict fields in the data dictionary to JSON strings.
- Updated `before_dataset_index` method to call `_before_index_dump_dicts` to ensure all fields can be indexed by Solr.
- Enhanced docstring for `_before_index_dump_dicts` to explain the necessity of converting dict fields to JSON strings, referencing related issues and errors.
- Ensured that all fields in the data dictionary are in a format that Solr can handle, preventing errors such as "missing required field" even when the field is present.
- Addressed issues observed in CKAN versions 2.10.4 and Solr 9, where attempts to upload resources to the Datastore resulted in errors due to the presence of dict fields in the data dictionary (ckan/ckan#8423).
- Referenced related issues: CKAN - Custom plugin/theme error datastore using fluent presets #7750 and Solr error: missing required field #7730.