Regex search is extremely slow with python3

Hi.

As described [here](http://api.mongodb.org/python/2.8/api/bson/regex.html):

> in Python 3, a regular expression compiled from a str has the re.UNICODE flag set.
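
To illustrate the quoted behaviour, here is a minimal check that uses only the standard `re` module (no PyMongo involved). The variable names are mine and purely illustrative; it only shows which compiled patterns carry `re.UNICODE`, not how the flag is sent to the server.

```python
import re

# A pattern compiled from str implicitly carries re.UNICODE in Python 3.
str_pattern = re.compile('^test')
# A pattern compiled from bytes does not carry that flag.
bytes_pattern = re.compile(b'^test')

has_unicode_str = bool(str_pattern.flags & re.UNICODE)
has_unicode_bytes = bool(bytes_pattern.flags & re.UNICODE)
print(has_unicode_str, has_unicode_bytes)  # True False
```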

But Unicode regexes are extremely slow:

```
2015-04-18T23:44:29.539+0300 I QUERY    [conn2] query test.computers query: { $or: [ { number: "test" }, { hostname: /^test/u }, { macs: /^test/u }, { ipmi: /^test/u }, { dc: /^test/u } ] } planSummary: IXSCAN { hostname: 1 }, IXSCAN { number: 1 }, IXSCAN { dc: 1 }, IXSCAN { macs: 1 }, IXSCAN { ipmi: 1 } ntoreturn:100 ntoskip:0 nscanned:498422 nscannedObjects:104692 keyUpdates:0 writeConflicts:0 numYields:4391 nreturned:3 reslen:643 locks:{} 1001ms

```

The non-Unicode version:

```
2015-04-18T23:42:33.177+0300 I QUERY    [conn1] query test.computers query: { $or: [ { number: "test" }, { hostname: /^test/ }, { macs: /^test/ }, { ipmi: /^test/ }, { dc: /^test/ } ] } planSummary: IXSCAN { hostname: 1 }, IXSCAN { number: 1 }, IXSCAN { dc: 1 }, IXSCAN { macs: 1 }, IXSCAN { ipmi: 1 } ntoreturn:100 ntoskip:0 nscanned:4 nscannedObjects:3 keyUpdates:0 writeConflicts:0 numYields:0 nreturned:3 reslen:643 locks:{} 1ms
```

I tried using this method in my model:

```
@staticmethod
def search(pattern):
    return Computer.objects.filter(
        db.Q(number=pattern) |
        db.Q(hostname__startswith=pattern) |
        db.Q(macs__startswith=pattern) |
        db.Q(ipmi__startswith=pattern) |
        db.Q(dc__startswith=pattern)
    )
```

But it generates a Unicode regex (the first query above).
For now I'm using a different method:

```
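# Module-level imports needed by the method below.
import re

import bson.regex

@staticmethod
def search(pattern):
    # bson.regex.Regex defaults to flags=0, so no re.UNICODE flag is attached.
    r = bson.regex.Regex('^{}'.format(re.escape(pattern)))
    return Computer.objects(
        db.Q(number=pattern) |
        db.Q(hostname=r) |
        db.Q(macs=r) |
        db.Q(ipmi=r) |
        db.Q(dc=r)
    )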
```

This works fine (the second query above), but I think it is an ugly workaround.
I also tried passing a byte string instead of a Unicode string, but it is treated as invalid [here](https://github.com/MongoEngine/mongoengine/blob/master/mongoengine/fields.py#L90-L111).
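
Another possible workaround, sketched here under assumptions: build the query as a plain dictionary that uses MongoDB's `$regex` operator with a string pattern and no `$options`, so no flag is attached at all. The helper name `build_prefix_query` and its parameters are hypothetical, and I have not measured whether this avoids the slowdown in the same way as the `bson.regex.Regex` version.

```python
import re


def build_prefix_query(pattern, exact_field, prefix_fields):
    """Build a raw $or query: one exact match plus anchored prefix matches.

    Hypothetical helper; the regex is passed as a plain string under
    "$regex" with no "$options", so no Unicode flag is sent.
    """
    anchored = '^' + re.escape(pattern)
    clauses = [{exact_field: pattern}]
    for field in prefix_fields:
        clauses.append({field: {'$regex': anchored}})
    return {'$or': clauses}


query = build_prefix_query('test', 'number', ['hostname', 'macs', 'ipmi', 'dc'])
print(query)
```

With MongoEngine, a dictionary like this could be passed as `Computer.objects(__raw__=query)`; `Computer` is the model from the snippets above.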
