Index different CKAN entities in Solr, not just datasets

Overview

ckanext-sitesearch

Index different CKAN entities in Solr, not just datasets

Requirements

This extension requires CKAN 2.9 or higher and Python 3

Features

Search actions

ckanext-sitesearch allows Solr-powered searches on the following CKAN entities:

Entity Action Permissions Notes
Organizations organization_search Public
Groups group_search Public
Users user_search Sysadmins only
Pages page_search Public (individual page permissions apply) Requires ckanext-pages

All *_search actions support most of the same paramters that package_search, except the facet* and include_* ones. That includes q, fq, rows, start and sort.

In all actions, the output matches the one of package_search as well, an object with a count key and a results one, wich is a list of the corresponding entities dict (ie the result of organization_show, user_show etc):

, , ] } ">
{
    "count": 2,
    "results": [
        
    
     ,
        
     
      ,
    ]
}


     
    

Additionally the plugin registers a site_search action that performs a search across all entities that the user is allowed to, including datasets. Results are returned in an object including the keys for which the user has permission to search on. For instance for a sysadmin user that has access to all searches:

, "organizations": , "groups": , "users": , "pages": }">
{
    "datasets": 
       
        ,
    "organizations": 
        
         ,
    "groups": 
         
          ,
    "users": 
          
           ,
    "pages": 
           
             } 
           
          
         
        
       

For each item, the results object is the one described above (count and results keys).

Note that all parameters are passed unchanged to each of the search actions, so this site-wide search is mostly useful for free-text searches like q=flood.

CLI

The plugin inlcudes a ckan command to reindex the current entities in the database in Solr:

ckan sitesearch rebuild 
   

   

Where entity_type is one of organizations, groups, users or pages. You can also pass the id or name of a particular entity to index just that particular one:

ckan sitesearch rebuild organization department-of-transport

Check the command help for additional options:

ckan sitesearch rebuild --help

Installation

To install ckanext-sitesearch:

  1. Activate your CKAN virtual environment, for example:

    . /usr/lib/ckan/default/bin/activate

  2. Clone the source and install it on the virtualenv

    git clone https://github.com/okfn/ckanext-sitesearch.git cd ckanext-sitesearch pip install -e . pip install -r requirements.txt

  3. Add sitesearch to the ckan.plugins setting in your CKAN config file (by default the config file is located at /etc/ckan/default/ckan.ini).

  4. Restart CKAN

Config settings

None at present

Developer installation

To install ckanext-sitesearch for development, activate your CKAN virtualenv and do:

git clone https://github.com/okfn/ckanext-sitesearch.git
cd ckanext-sitesearch
python setup.py develop

Tests

To run the tests, do:

pytest --ckan-ini=test.ini

License

AGPL

Owner
Open Knowledge Foundation
Also find us at: @frictionlessdata @opentrials @openspending @openknowledge-archive
Open Knowledge Foundation
A website which allows you to play with the GPT-2 transformer

transformers A website which allows you to play with the GPT-2 model Built with ❤️ by raphtlw Table of contents Model Setup About Contributors Model T

raphtlw 2 Jan 27, 2022
The Easy-to-use Dialogue Response Selection Toolkit for Researchers

The Easy-to-use Dialogue Response Selection Toolkit for Researchers

GMFTBY 32 Nov 13, 2022
ADCS - Automatic Defect Classification System (ADCS) for SSMC

Table of Contents Table of Contents ADCS Overview Summary Operator's Guide Demo System Design System Logic Training Mode Production System Flow Folder

Tam Zher Min 2 Jun 24, 2022
Super easy library for BERT based NLP models

Fast-Bert New - Learning Rate Finder for Text Classification Training (borrowed with thanks from https://github.com/davidtvs/pytorch-lr-finder) Suppor

Utterworks 1.8k Dec 27, 2022
Searching keywords in PDF file folders

keyword_searching Steps to use this Python scripts: (1)Paste this script into the file folder containing the PDF files you need to search from; (2)Thi

1 Nov 08, 2021
A music comments dataset, containing 39,051 comments for 27,384 songs.

Music Comments Dataset A music comments dataset, containing 39,051 comments for 27,384 songs. For academic research use only. Introduction This datase

Zhang Yixiao 2 Jan 10, 2022
PG-19 Language Modelling Benchmark

PG-19 Language Modelling Benchmark This repository contains the PG-19 language modeling benchmark. It includes a set of books extracted from the Proje

DeepMind 161 Oct 30, 2022
Code repository of the paper Neural circuit policies enabling auditable autonomy published in Nature Machine Intelligence

Code repository of the paper Neural circuit policies enabling auditable autonomy published in Nature Machine Intelligence

9 Jan 08, 2023
PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers

PeCo: Perceptual Codebook for BERT Pre-training of Vision Transformers

Microsoft 105 Jan 08, 2022
:house_with_garden: Fast & easy transfer learning for NLP. Harvesting language models for the industry. Focus on Question Answering.

(Framework for Adapting Representation Models) What is it? FARM makes Transfer Learning with BERT & Co simple, fast and enterprise-ready. It's built u

deepset 1.6k Dec 27, 2022
AEC_DeepModel - Deep learning based acoustic echo cancellation baseline code

AEC_DeepModel - Deep learning based acoustic echo cancellation baseline code

凌逆战 75 Dec 05, 2022
SHAS: Approaching optimal Segmentation for End-to-End Speech Translation

SHAS: Approaching optimal Segmentation for End-to-End Speech Translation In this repo you can find the code of the Supervised Hybrid Audio Segmentatio

Machine Translation @ UPC 21 Dec 20, 2022
Code for evaluating Japanese pretrained models provided by NTT Ltd.

japanese-dialog-transformers 日本語の説明文はこちら This repository provides the information necessary to evaluate the Japanese Transformer Encoder-decoder dialo

NTT Communication Science Laboratories 216 Dec 22, 2022
SimCTG - A Contrastive Framework for Neural Text Generation

A Contrastive Framework for Neural Text Generation Authors: Yixuan Su, Tian Lan,

Yixuan Su 345 Jan 03, 2023
Code Implementation of "Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction".

Span-ASTE: Learning Span-Level Interactions for Aspect Sentiment Triplet Extraction ***** New March 31th, 2022: Scikit-Style API for Easy Usage *****

Chia Yew Ken 111 Dec 23, 2022
Jarvis is a simple Chatbot with a GUI capable of chatting and retrieving information and daily news from the internet for it's user.

J.A.R.V.I.S Kindly consider starring this repository if you like the program :-) What/Who is J.A.R.V.I.S? J.A.R.V.I.S is an chatbot written that is bu

Epicalable 50 Dec 31, 2022
pyMorfologik MorfologikpyMorfologik - Python binding for Morfologik.

Python binding for Morfologik Morfologik is Polish morphological analyzer. For more information see http://github.com/morfologik/morfologik-stemming/

Damian Mirecki 18 Dec 29, 2021
Making text a first-class citizen in TensorFlow.

TensorFlow Text - Text processing in Tensorflow IMPORTANT: When installing TF Text with pip install, please note the version of TensorFlow you are run

1k Dec 26, 2022
An assignment on creating a minimalist neural network toolkit for CS11-747

minnn by Graham Neubig, Zhisong Zhang, and Divyansh Kaushik This is an exercise in developing a minimalist neural network toolkit for NLP, part of Car

Graham Neubig 63 Dec 29, 2022
It analyze the sentiment of the user, whether it is postive or negative.

Sentiment-Analyzer-Tool It analyze the sentiment of the user, whether it is postive or negative. It uses streamlit library for creating this sentiment

Paras Patidar 18 Dec 17, 2022