Character encoding in script console

jszpyrka
Posts: 13
Joined: 06 Dec 2012, 11:28

Hello,

When I write the following in the script console

print("ąśćę")
and choose BeanShell or Groovy, the result is OK, but when I choose Python there is a problem with the character encoding.

How can I configure Python in Protege to use UTF-8 encoding?

best regards
Jacek Szpyrka
neil.walsh
Posts: 444
Joined: 16 Feb 2009, 13:45

Hi Jacek,

We'll bring this up as an issue with the Protege team. I suspect the script tab may be using v1.x of Python but I'm uncertain and will need to do some investigating.

I'll post back should we find a solution for this.

Thanks

Neil
jonathan.carter
Posts: 1087
Joined: 04 Feb 2009, 15:44

Neil is correct. There are a few specific things you need to do to make Python handle UTF-8 or Unicode characters.

If you know the Unicode code points, you can use the \uNNNN escape to specify each character.

For example, here is the Python code to add some details about a Chinese language:

aNewInstance = GetInstanceByName("Report_Language", u'Report Language: Chinese (Traditional) - \u4E2D\u6587 (\u7E41\u9AD4)')
Note that to get Python to recognise that you're using a Unicode string, you must prefix the string with 'u', e.g. u'my string in Unicode'.
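
As a rough sketch of the \uNNNN escape and the 'u' prefix working together, the snippet below builds a similar label and round-trips it through UTF-8. It does not call the Protege API; the names label, encoded and decoded are made up for illustration, and it assumes an interpreter that accepts the 'u' prefix (Python 2.x, or Python 3.3 and later).

```python
# Illustrative only: label, encoded and decoded are hypothetical names,
# not part of the Protege scripting API.

# The u prefix marks a Unicode string; \uNNNN names a character by code point.
label = u'Chinese (Traditional) - \u4E2D\u6587 (\u7E41\u9AD4)'

# Encoding turns the Unicode string into UTF-8 bytes, e.g. for writing to a file.
encoded = label.encode('utf-8')

# Decoding those bytes gives back the original Unicode string.
decoded = encoded.decode('utf-8')
```

Because the escapes are plain ASCII in the source, this form should not depend on how the script file or console input itself is encoded.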

I also found that I needed to add the following as the first line of any Python script, to tell the Python environment that everything that follows is encoded in UTF-8:

# -*- coding: UTF-8 -*-
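
A minimal sketch of a script that starts with this declaration might look like the following. It assumes the script source really is saved as UTF-8, and the names text and same are hypothetical. Whether the script console then displays the characters correctly may also depend on the console's own output encoding, which this sketch does not address.

```python
# -*- coding: UTF-8 -*-
# Sketch only: text and same are hypothetical names used for illustration.

# With the declaration above, non-ASCII characters can be typed directly
# inside a u'' literal instead of being written as \uNNNN escapes.
text = u'ąśćę'

# The literal and its escaped spelling should be the same string.
same = (text == u'\u0105\u015b\u0107\u0119')
```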
Jonathan
Essential Project Team