Python-twitter was built for the West, so I fixed it up for Japanese

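When I made 半裸Bot, I used Python-twitter to poke at the Twitter API from Python. But whenever I tried to post a slightly long tweet, it raised an exception saying the text was over 140 characters. The tweet was really only about 60 characters long, so something was off.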

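Wondering why, I looked at the source and found the cause. The length check treats the status as an 8-bit (UTF-8 encoded) string, so it counts bytes rather than characters. Each Japanese character takes 3 bytes in UTF-8, so Japanese text gets judged to be three times its real length: a 60-character tweet counts as about 180. Only thinking about the ASCII world is common in libraries made in the West, so I had half suspected this.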
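
To make the byte-versus-character difference concrete, here is a minimal sketch. It is written for Python 3 (the library code in this post is Python 2), and the sample text is just something made up for illustration.

```python
# Sample text for illustration only (not taken from the bot).
text = "日本語のツイート"            # 8 characters
encoded = text.encode("utf-8")      # the bytes an 8-bit string would hold

char_count = len(text)              # counts characters
byte_count = len(encoded)           # counts bytes, 3 per character here

print(char_count, byte_count)
```

The byte count comes out three times the character count, which is the same inflation that trips the 140-character check.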

So I tweaked the source code a bit.

I rewrote the beginning of the PostUpdate() method of the Api class as shown below. Lines starting with "+" are the ones I changed or added.

  def PostUpdate(self, status, in_reply_to_status_id=None):
    '''Post a twitter status message from the authenticated user.

    The twitter.Api instance must be authenticated.

    Args:
      status:
        The message text to be posted.  Must be less than or equal to
        140 characters.
      in_reply_to_status_id:
        The ID of an existing status that the status to be posted is
        in reply to.  This implicitly sets the in_reply_to_user_id
        attribute of the resulting status to the user ID of the
        message being replied to.  Invalid/missing status IDs will be
        ignored. [Optional]
    Returns:
      A twitter.Status instance representing the message posted.
    '''
    if not self._username:
      raise TwitterError("The twitter.Api instance must be authenticated.")

    url = 'http://twitter.com/statuses/update.json'

+    if not isinstance(status, unicode):
+        status = unicode(status, self._input_encoding)

    if len(status) > CHARACTER_LIMIT:
      raise TwitterError("Text must be less than or equal to %d characters. "
                         "Consider using PostUpdates." % CHARACTER_LIMIT)

+    data = {'status': status.encode('utf-8')}
    if in_reply_to_status_id:
      data['in_reply_to_status_id'] = in_reply_to_status_id
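
The same idea can be tried outside the library. The following is only a rough sketch in Python 3, not the library's code: `checked_status` is a hypothetical helper name, the `input_encoding` default of UTF-8 is an assumption, and it raises `ValueError` in place of the library's `TwitterError`. It decodes bytes to text first, then checks the length in characters.

```python
CHARACTER_LIMIT = 140  # same limit as in the excerpt above


def checked_status(status, input_encoding="utf-8"):
    """Return status as text, or raise ValueError if it is too long.

    Hypothetical helper for illustration; not part of python-twitter.
    """
    # Decode first so that len() counts characters, not bytes.
    if isinstance(status, bytes):
        status = status.decode(input_encoding)
    if len(status) > CHARACTER_LIMIT:
        raise ValueError(
            "Text must be less than or equal to %d characters." % CHARACTER_LIMIT
        )
    return status


# A 60-character Japanese tweet, handed over as UTF-8 bytes.
tweet = ("あ" * 60).encode("utf-8")
print(len(tweet), len(checked_status(tweet)))
```

With the decode step in place, the 60-character tweet passes even though its encoded form is longer than 140 bytes.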

2010-08-27 04:53

trivial technologies
