AJING NOTE: [GAE] Query Methods in Google Cloud Datastore

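This post documents the query methods in Google Cloud Datastore.

1. query(): note that query() returns a Query object, not entities

We can use the query() method to search Google Cloud Datastore for entities that match specific conditions.

Usually we wrap it in the Model class and call it as a class method.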

import cgi

import webapp2
from google.appengine.ext import ndb


class Greeting(ndb.Model):
    """Models an individual Guestbook entry with content and date."""
    content = ndb.StringProperty()
    date = ndb.DateTimeProperty(auto_now_add=True)

    @classmethod
    def query_book(cls, ancestor_key):
        return cls.query(ancestor=ancestor_key).order(-cls.date)

class MainPage(webapp2.RequestHandler):
    GREETINGS_PER_PAGE = 20

    def get(self):
        guestbook_name = self.request.get('guestbook_name')
        ancestor_key = ndb.Key('Book', guestbook_name or '*notitle*')
        greetings = Greeting.query_book(ancestor_key).fetch(self.GREETINGS_PER_PAGE)

        self.response.out.write('<html><body>')

        for greeting in greetings:
            self.response.out.write('<blockquote>%s</blockquote>' % cgi.escape(greeting.content))

        self.response.out.write('</body></html>')

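* The client sends an HTTP GET with a guestbook_name parameter. The server builds the key of the matching guestbook with ndb.Key('Book', guestbook_name) and uses it as the ancestor to query the messages under that guestbook.

* cgi.escape() converts the characters &, < and > in a string into HTML-safe sequences (&amp;, &lt;, &gt;).

A Query object can be narrowed further with filter().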

query1 = Account.query()  # Retrieve all Account entities
query2 = query1.filter(Account.userid >= 40)  # Filter on userid >= 40

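There is also an order() method for sorting.

query = Greeting.query().order(Greeting.content, -Greeting.date)  # a minus sign means descending order

Finally, call fetch() to get the entities that match the query; pass a number to fetch() to return only the first N entities.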
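
To illustrate the point that query(), filter() and order() only describe a search and nothing is read until fetch() is called, here is a minimal in-memory sketch. It is not NDB: the MiniQuery class and the sample rows are hypothetical and only imitate the behavior described above.

```python
class MiniQuery(object):
    """Hypothetical stand-in for a Query: it only records what to do."""

    def __init__(self, rows, predicates=(), sort_key=None, reverse=False):
        self._rows = rows
        self._predicates = tuple(predicates)
        self._sort_key = sort_key
        self._reverse = reverse

    def filter(self, predicate):
        # Returns a NEW query object; the original query is left unchanged.
        return MiniQuery(self._rows, self._predicates + (predicate,),
                         self._sort_key, self._reverse)

    def order(self, key, descending=False):
        # Also returns a new query object; still no data is read.
        return MiniQuery(self._rows, self._predicates, key, descending)

    def fetch(self, limit=None):
        # Only here is the data actually read, filtered and sorted.
        rows = [r for r in self._rows if all(p(r) for p in self._predicates)]
        if self._sort_key is not None:
            rows.sort(key=lambda r: r[self._sort_key], reverse=self._reverse)
        return rows if limit is None else rows[:limit]


# Hypothetical sample data standing in for Account entities.
accounts = [{'userid': 10}, {'userid': 40}, {'userid': 55}, {'userid': 70}]

query1 = MiniQuery(accounts)                          # describes "all accounts"
query2 = query1.filter(lambda r: r['userid'] >= 40)   # still just a description
top_two = query2.order('userid', descending=True).fetch(2)
```

In this sketch query1 and query2 are both query objects, and only top_two is a list of results, which mirrors the distinction between a Query and the entities returned by fetch().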

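2. Query conditions:

The basic form is model.query(model.property == value); the usual comparison operators !=, >, <, >= and <= also work.
IN is a special case: model.property.IN([value1, value2]) is equivalent to (model.property == value1) OR (model.property == value2).
To combine several conditions, separate them with commas (they are ANDed together), for example: Student.query(Student.age > 10, Student.age <= 12).

Multiple ANDs and ORs can also be nested, but a filter that is too complex can raise an error, so it is safer to normalize it first. Here is the example from the official documentation: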

query = Article.query(ndb.AND(Article.tags == 'python',
                              ndb.OR(Article.tags.IN(['ruby', 'jruby']),
                                     ndb.AND(Article.tags == 'php',
                                             Article.tags != 'perl'))))

Normalization steps:
(1) Expand IN and !=: property.IN([a, b]) is equivalent to OR(property == a, property == b), and property != a is equivalent to OR(property < a, property > a)
(2) AND(a, b, OR(c, d)) -> OR(AND(a, b, c), AND(a, b, d))
(3) AND(a, b, AND(c, d)) -> AND(a, b, c, d)
(4) OR(a, b, OR(c, d)) -> OR(a, b, c, d)
Result:

OR(AND(tags == 'python', tags == 'ruby'),
   AND(tags == 'python', tags == 'jruby'),
   AND(tags == 'python', tags == 'php', tags < 'perl'),
   AND(tags == 'python', tags == 'php', tags > 'perl'))
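
The normalization steps above can be sketched in plain Python. This is only an illustration and not NDB's own implementation: the tuple representation of filters and the function names expand, to_dnf and normalize are assumptions made for this example.

```python
from itertools import product


def expand(node):
    """Step (1): rewrite IN and != in terms of ==, <, > and OR."""
    op = node[0]
    if op == 'IN':
        _, prop, values = node
        return ('OR', [('==', prop, v) for v in values])
    if op == '!=':
        _, prop, value = node
        return ('OR', [('<', prop, value), ('>', prop, value)])
    if op in ('AND', 'OR'):
        return (op, [expand(child) for child in node[1]])
    return node


def to_dnf(node):
    """Steps (2)-(4): return a list of AND-groups of simple filters.

    The outer list is the single top-level OR; each inner list is one AND.
    """
    op = node[0]
    if op == 'OR':
        # Step (4): nested ORs flatten into one list of groups.
        return [group for child in node[1] for group in to_dnf(child)]
    if op == 'AND':
        # Step (2) distributes AND over OR; step (3) flattens nested ANDs.
        child_groups = [to_dnf(child) for child in node[1]]
        return [sum(combo, []) for combo in product(*child_groups)]
    return [[node]]


def normalize(node):
    return to_dnf(expand(node))


# The Article filter from the example above, written as nested tuples.
article_filter = ('AND', [
    ('==', 'tags', 'python'),
    ('OR', [
        ('IN', 'tags', ['ruby', 'jruby']),
        ('AND', [('==', 'tags', 'php'), ('!=', 'tags', 'perl')]),
    ]),
])
groups = normalize(article_filter)
```

Here groups holds four AND-groups, matching the four AND(...) terms of the normalized result shown above.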

3. Ancestor Query:

Ancestors give our data stronger consistency, but there is a drawback: entities that share the same ancestor are limited to 1 write per second.

Consider the following two examples:
(1) non-ancestor example:

class Customer(ndb.Model):
    name = ndb.StringProperty()

class Purchase(ndb.Model):
    customer = ndb.KeyProperty(kind=Customer)
    price = ndb.IntegerProperty()

purchases = Purchase.query(Purchase.customer == customer_entity.key).fetch()

This approach allows fast writes, but queries are only eventually consistent: a query run right after a write may not yet return the newly written purchase.

(2) ancestor example:

class Customer(ndb.Model):
    name = ndb.StringProperty()

class Purchase(ndb.Model):
    price = ndb.IntegerProperty()

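# Each purchase is created with Purchase(parent=customer_entity.key); query by ancestor:
purchases = Purchase.query(ancestor=customer_entity.key).fetch()

Each purchase key has the customer key embedded in it, which is what lets us find a customer's purchases by querying on the customer.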
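
To make the idea of a key that embeds its ancestor more concrete, here is a small sketch that models a key as a path of (kind, id) pairs. It does not use NDB: make_key, has_ancestor and the sample ids are hypothetical names chosen for illustration.

```python
def make_key(kind, ident, parent=()):
    """Build a hypothetical key: the parent's path plus one (kind, id) pair."""
    return parent + ((kind, ident),)


def has_ancestor(key, ancestor):
    """A key descends from `ancestor` if the ancestor path is a prefix of it."""
    return key[:len(ancestor)] == ancestor


customer_key = make_key('Customer', 'alice')
other_customer_key = make_key('Customer', 'bob')

purchase_keys = [
    make_key('Purchase', 1, parent=customer_key),
    make_key('Purchase', 2, parent=customer_key),
    make_key('Purchase', 3, parent=other_customer_key),
]

# An "ancestor query" in this model is just a prefix match on the key path.
alice_purchases = [k for k in purchase_keys if has_ancestor(k, customer_key)]
```

Because the customer path is part of every purchase key in this model, finding one customer's purchases needs no separate customer property on Purchase.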

4. Getting a property from a string:

property_to_query = 'location'
query = FlexEmployee.query(ndb.GenericProperty(property_to_query) == 'SF') ## Expando

query = Article.query(Article._properties[keyword] == value)

query = Article.query(getattr(Article, keyword) == value)
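
The getattr() form works because a model's properties are ordinary class attributes whose == operator builds a filter rather than returning a boolean. The sketch below imitates that idea in plain Python; MiniProperty and this stand-in Article class are hypothetical and are not NDB classes.

```python
class MiniProperty(object):
    """Hypothetical property: comparing it builds a filter instead of a bool."""

    def __init__(self, name):
        self.name = name

    def __eq__(self, value):
        # Return a description of the condition, not True/False.
        return (self.name, '=', value)


class Article(object):
    # Stand-in model with two properties declared as class attributes.
    title = MiniProperty('title')
    tags = MiniProperty('tags')


keyword = 'tags'     # property name only known at runtime
value = 'python'

by_attribute = (Article.tags == value)
by_getattr = (getattr(Article, keyword) == value)
```

Both expressions produce the same filter description, so looking the property up by name is interchangeable with writing the attribute directly.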

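5. Query Cursors:

A cursor is a pointer to a position inside a query's result set. fetch_page() returns a triple (results, cursor, more), where more is a flag indicating whether there may be more results.
The code makes this easier to follow.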

from google.appengine.datastore.datastore_query import Cursor

class List(webapp2.RequestHandler):
    GREETINGS_PER_PAGE = 10

    def get(self):
        """Handles requests like /list?cursor=1234567."""
        cursor = Cursor(urlsafe=self.request.get('cursor'))
        greets, next_cursor, more = Greeting.query().fetch_page(
            self.GREETINGS_PER_PAGE, start_cursor=cursor)

        self.response.out.write('<html><body>')

        for greeting in greets:
            self.response.out.write('<blockquote>%s</blockquote>' % cgi.escape(greeting.content))

        if more and next_cursor:
            self.response.out.write('<a href="/list?cursor=%s">More...</a>' %
                                    next_cursor.urlsafe())

        self.response.out.write('</body></html>')

* Note the use of urlsafe() and Cursor(urlsafe=s) to serialize and deserialize the cursor.
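
The paging loop can be imitated with an in-memory list to show how the (results, cursor, more) triple is consumed. This is a sketch under loud assumptions: real Datastore cursors are opaque and are not offsets, and encode_cursor, decode_cursor and this fetch_page function are hypothetical helpers, not the NDB API.

```python
import base64


def encode_cursor(offset):
    """Serialize a position into a URL-safe string (hypothetical format)."""
    return base64.urlsafe_b64encode(str(offset).encode('ascii')).decode('ascii')


def decode_cursor(token):
    """Inverse of encode_cursor; an empty token means start from the top."""
    if not token:
        return 0
    return int(base64.urlsafe_b64decode(token.encode('ascii')).decode('ascii'))


def fetch_page(rows, page_size, start_cursor=''):
    """Return (results, next_cursor, more), like the triple described above."""
    start = decode_cursor(start_cursor)
    results = rows[start:start + page_size]
    end = start + len(results)
    more = end < len(rows)
    return results, encode_cursor(end), more


# Hypothetical data: 25 greetings, read 10 at a time.
greetings = ['greeting %d' % i for i in range(25)]

pages = []
cursor = ''
while True:
    results, cursor, more = fetch_page(greetings, 10, start_cursor=cursor)
    pages.append(results)
    if not more:
        break
```

Each iteration passes the previous cursor back in, which is the same round trip the handler performs through the cursor URL parameter.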

6. Calling a Function for each Entity ("Mapping"):

Using map(callback) lets the per-entity work run concurrently, which speeds things up.

### slow ###
message_account_pairs = []
for message in message_query:
    key = ndb.Key('Account', message.userid)
    account = key.get()
    message_account_pairs.append((message, account))

### faster ####
def callback(message):
    key = ndb.Key('Account', message.userid)
    account = key.get()
    return message, account  # map() collects the callback's return values

message_account_pairs = message_query.map(callback)
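
The benefit of mapping a callback over the results can be seen with a small stand-alone sketch. Threads are used here only as a stand-in for concurrency; this is not how NDB implements map(), and get_account, ACCOUNTS and the sleep that simulates a datastore round trip are all hypothetical.

```python
import time
from concurrent.futures import ThreadPoolExecutor

# Hypothetical account table standing in for Account entities.
ACCOUNTS = {1: 'alice', 2: 'bob', 3: 'carol', 4: 'dave'}


def get_account(userid):
    """Hypothetical lookup; the sleep stands in for one datastore round trip."""
    time.sleep(0.05)
    return ACCOUNTS[userid]


def callback(userid):
    # The callback must return the value that should appear in the result.
    return userid, get_account(userid)


userids = [1, 2, 3, 4]

# Slow: one lookup after another.
start = time.time()
slow_pairs = [callback(u) for u in userids]
slow_seconds = time.time() - start

# Faster: the lookups overlap, and results keep the input order.
start = time.time()
with ThreadPoolExecutor(max_workers=4) as pool:
    fast_pairs = list(pool.map(callback, userids))
fast_seconds = time.time() - start
```

Both versions produce the same pairs; the mapped version simply lets the waits overlap instead of adding up.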

7. GQL:

GQL is a SQL-like language for retrieving entities from Google Cloud Datastore.
(1) ndb.gql(querystring) returns a Query object
(2) Model.gql(querystring) is a shorthand for ndb.gql("SELECT * FROM Model " + querystring).

References:
1. Official documentation: https://cloud.google.com/appengine/docs/standard/python/ndb/queries

Monday, April 22, 2019
